A call for better methods for evaluating educational apps
Richard Culatta and Katrina Stevens, U.S. Department of Education
As more and more apps and digital tools for education become available, families and teachers are rightly asking how they can know if an app actually lives up to the claims made by its creators. The field of educational technology changes rapidly, with apps launched daily; app creators often claim that their technologies are effective when there is no high-quality evidence to support these claims. Every app sounds world-changing in its app store description, but how do we know if an app really makes a difference for teaching and learning?
In the past, we’ve used traditional multi-year, one-shot research studies. These studies go something like this: one group of students gets to use the app (treatment group) while another group of students doesn’t (control group). Other variables are controlled for as best as possible. After a year or so, both groups of students are tested and compared. If the group that used the app did better on the assessment than the group that didn’t, we know with some degree of confidence that the app makes a difference. This traditional approach is appropriate in many circumstances, but just does not work well in the rapidly changing world of educational technology for a variety of reasons.
1) Takes too long
Waiting as long as two years to know whether or not an app helps students learn is simply too long — apps are often updated on a weekly or monthly basis as new features are added, bugs are fixed, and user feedback is implemented. The app measured at the start of a traditional multi-year study may be a completely different app by the time the study is finished, making the results of the study irrelevant.
2) Costs too much and can’t keep up
The complete development costs for many educational apps are a fraction of the cost of conducting a traditional educational research study. It wouldn’t be economically feasible for most app creators (or schools) to spend $250k (a low price tag for traditional educational research) to evaluate the effectiveness of an app that cost only $50k in total to build. Even if cost were not an issue, there is also a logistical problem with applying traditional research methods to educational apps: those methods simply can’t keep up with the ever-increasing number of apps.
3) Not iterative
Traditional research approaches often make a single estimate of effectiveness; the treatment either worked or it didn’t. But apps aren’t static interventions. Apps are built iteratively — over time functionality is added or modified. A research approach that studies apps should also cycle with the design iterations of the app and show whether an app is improving over time. Similarly, snapshot data often doesn’t fully capture the context of an app’s implementation over a period of time.
4) Different purpose
Traditional research approaches are useful in demonstrating causal connections. Rapid cycle tech evaluations have a different purpose. Most school leaders, for example, don’t require absolute certainty that an app is the key factor for improving student achievement. Instead, they want to know if an app is likely to work with their students and teachers. If a tool’s use is limited to an after-school program, for example, the evaluation could be adjusted to meet that more targeted need. Collecting some evidence is better than having none, and definitely better than an over-reliance on the opinions of a small group of peers or well-designed marketing materials.
The important questions to ask of an app or tool are: Does it work? With whom? In what circumstances? Some tools work better with some populations than with others; educators want to know if a study included students and schools similar to their own so they can judge whether the tool is likely to work in their situation.
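To make the "with whom?" question concrete, here is a minimal sketch of the treatment-versus-control comparison described earlier, reported separately for each student population. Everything in it is an assumption made for illustration: the population labels, the scores, and the helper name `mean_difference` are all invented, and a raw difference in averages says nothing on its own about statistical significance or causation.

```python
from statistics import mean


def mean_difference(treatment, control):
    """Average score of the app-using group minus the average of the comparison group."""
    return mean(treatment) - mean(control)


# Hypothetical assessment scores, invented purely for illustration.
scores = {
    "urban middle school": {
        "treatment": [78, 85, 82, 90],
        "control": [72, 80, 75, 84],
    },
    "rural after-school program": {
        "treatment": [70, 74, 69, 77],
        "control": [71, 73, 68, 76],
    },
}

# Report the difference per population rather than as one pooled number,
# so a reader can look for the setting that most resembles their own.
results = {
    population: mean_difference(groups["treatment"], groups["control"])
    for population, groups in scores.items()
}

for population, diff in results.items():
    print(f"{population}: app group scored {diff:+.2f} points relative to comparison group")
```

Reporting the result per population, instead of a single overall average, is what lets an educator check whether the evidence comes from students and schools like their own.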
There is a pressing need for low-cost, quick turnaround evaluations. Two years ago the President announced the ConnectED Initiative, which called on the public and private sectors alike to work together to improve internet connectivity to schools across the country. Today, thanks to wide bipartisan and cross-sector support, significant funding is becoming available to help schools close the connectivity gap. This includes a one-time $2 billion investment by the Federal Communications Commission (FCC) to increase wifi in classrooms, a yearly $1.5 billion increase in the FCC’s E-Rate program, and an additional $2 billion in private sector contributions.
As a result, over the next two years, we will go from having roughly 30% of schools connected to wifi in the classroom to having nearly all students in classrooms with high-speed wifi. This is a monumental step forward and has the potential to be one of the most transformative moments in American education. This new infrastructure has the potential to bring amazing real-world learning experiences to the classroom. It has the potential to close long-standing equity gaps that other approaches haven’t been able to address. It has the ability to personalize learning for all students and engage parents along the way. But our ability to realize this potential depends largely on the availability of effective apps that support this transformation. Over the next two years educators and parents will be making a huge number of decisions about which apps to use with kids. Making good decisions based on evidence, as opposed to relying on marketing hype or the buzz among a small group of peers, is critical.
And let’s be clear, this is bigger than just knowing whether apps improve student academic performance. Many apps claim to reduce teacher time spent on administrative tasks, for example, or increase parent engagement, or encourage collaboration among students. These are equally important data points that parents and educators alike should know when choosing which apps to present to their students.
What are we doing about it?
Last week, the U.S. Department of Education announced a Request for Proposal (RFP) for Rapid-Cycle Technology Evaluations. We’re looking for innovative approaches to evaluating educational apps to help schools and parents make evidence-based decisions when choosing which apps to use with their students.
The project is also intended to design evaluation tools and training materials to support the field in conducting rapid cycle technology evaluations. Evaluation tools may include templates for establishing clear expectations for all participants, protocols for best practices, applications (for developers or educators) to participate in a study, surveys, checklists, or quality assurance materials. Training materials may include resources for use before, during, and after a study, such as self-assessments for participating educators (to indicate readiness for a study), technical training, resources for developers on working with schools, and guidance on interpreting study results. While the evaluation of a specific tool is the focus of this work, building capacity among participants is an important expected outcome.
The product evaluations supported by this contract are meant to demonstrate whether certain types of studies (for example, studies that look at effects on outcomes but do not try to explain the mechanism by which any effect occurred, and/or studies that use administrative data) can be conducted rapidly enough to meet educators’ need for information about the effectiveness of technology in this fast-changing landscape. All of these factors are increasing the need to identify what’s working and what’s not more efficiently and more effectively.
This project will establish a standard for low-cost, quick turnaround evaluations of apps, and field test rapid-cycle evaluations. In addition to generating evidence on specific apps, the project will help develop protocols for conducting rapid cycle evaluations of apps that practitioners, developers, and researchers can use beyond the scope of this evaluation.
This work follows on the guide released in 2013, Expanding Evidence, which calls for smart change by presenting educators, policymakers, and funders with an expanded view of evidence approaches and sources of data that can help them with decision-making about learning resources.
To learn more, or to submit a proposal, go here.
We need to help schools and families make the best use of their resources — both time and money. School and family budgets aren’t likely to increase significantly and the total hours of the day remain the same. Technology has the power to support the transformation of teaching and learning, but only when we know what works. By employing rapid-cycle approaches to evaluating the effectiveness of educational apps, we can make choices about which apps we use based on evidence, not hype.