We are an independent, solution-oriented think tank at Georgetown University's McCourt School of Public Policy.

Parsing the Evidence Requirements of Federal Covid Aid

School districts are grappling with how to spend the latest and largest installment of federal relief for public schools, the American Rescue Plan’s $123 billion Elementary and Secondary Schools Emergency Relief (ARP ESSER) fund. The law requires that educators spend a significant portion of the money on things that have been found through research to make a difference for students. In reality, federal policymakers have given the education sector wide latitude in the evidence they bring to bear in spending Covid aid.

Districts, which receive the bulk of ESSER funds, must spend 20 percent of their money to address learning loss through the implementation of “evidence-based” interventions. States must also reserve funds for evidence-based interventions to address learning loss (5 percent of the total the state receives), to support evidence-based summer enrichment programs (1 percent of the total), and to support evidence-based comprehensive after-school programs (1 percent of the total).

What does evidence-based mean? The ARP uses the same definition as the federal Every Student Succeeds Act (ESSA), which establishes four tiers of evidence and requires a subset of struggling schools to use the top three levels.

Unlike under ESSA, though, Covid relief dollars can be spent on interventions with any of the four tiers of evidence, including the least demanding. That includes spending options that don’t appear in evidence lists or research clearinghouses. In its latest guidance, released in May, the U.S. Department of Education again clarified that any of the four tiers “counts.”

The most permissive option is for evidence that “demonstrates a rationale.” That’s because you don’t need a study of the specific activity, strategy, or intervention to demonstrate a rationale for its effectiveness.

In contrast, the more restrictive definitions—for strong, moderate, and promising evidence—all require a statistically significant effect on outcomes based on at least one study.

[Read More: What Congressional Covid Funding Means for K-12 Education]

Statistically significant effects may not necessarily be large enough to be educationally meaningful, let alone cost-effective. But they are highly unlikely to be generated by chance. If a study finds a statistically significant result, its authors will be sure to make that clear, using those exact words. Such a study doesn’t need to be published in a peer-reviewed journal. And, for better or worse, only one study needs to find a statistically significant effect—even if many others find a different result.

The differences among the top three evidentiary tiers are in study design. Let’s say you find a study on a reading intervention and want to know if you can count that intervention as evidence-based in your ESSER planning. Here’s how you could clear the bar using the first three evidence levels:

  • Only a randomized control trial, where some students or schools are randomly assigned to receive the intervention and similar groups are not, allows a study to be counted as strong evidence. If you don’t see the word “random,” the evidence isn’t strong.
  • To get to the moderate level, the study would need a “quasi-experimental” design. This means you can tell a convincing story about why one group got a reading intervention while another highly similar group did not. For example, if a private foundation paid for staff in one district to be trained in the intervention in 2018, but not in another district serving a similar population, the relative changes between the two districts before and after the training would allow for a quasi-experimental study design.
  • To get to the promising level, the study would need to “control” for something related to differences in the two groups, speaking to why some students got the intervention and others didn’t. A study that controls for students’ test scores prior to the reading intervention could fit the bill, for example. The reason this wouldn’t count as moderate is because it doesn’t have a clear story for why, among a group of students with identical prior scores, some received the reading intervention and others didn’t.

It’s pretty straightforward to determine if evidence is “strong”: Does the study assign schools or students randomly to the intervention? While statisticians might quibble about whether a given study meets the moderate versus promising level of evidence, in practice, it doesn’t matter, as neither ESSA nor ARP ESSER treats the two tiers of interventions differently.

If you can’t find a study, perhaps because you are developing something new, or aren’t sure if a study meets these definitions, consider the most flexible option: demonstrating a rationale for the intervention.

What constitutes a rationale? The new guidance describes it like this:

  • Explain the reasoning for how the intervention, if successful, would improve outcomes. For example, if the reading intervention focuses on teaching common sound-spelling patterns, you could express that reasoning in very few words: understanding common sound-spelling patterns helps students learn to read. This would make sense if you are targeting reading-related learning loss or want to improve reading more generally. The Regional Educational Laboratory Pacific offers resources for making a logic model, which is a more structured option for explaining the reasoning behind your choice of intervention. If you use another planning model in ongoing district work, such as a theory of action, you can use that to meet this requirement.
  • The reasoning must be “based on high-quality research findings or positive evaluation.” High-quality research findings need not meet the strong, moderate, or promising levels described above: they could come from other types of research, whose methodologies don’t map to these definitions, or reflect a consensus view from a body of high-quality descriptive research. These findings, importantly, can relate to components of an intervention (like that instruction in common sound-spelling patterns helps students learn to read), and need not speak to a specific, often branded, intervention that incorporates the component. Reputable sources for accessible versions of such research findings include the federally sponsored What Works Clearinghouse (WWC) Practice Guides, and academic sources like the Annenberg Institute’s Ed Research for Recovery, the Campbell Collaboration, the Evidence Project at the Center for Reinventing Public Education, and FutureEd. You could point to this WWC guide for research supporting your reasoning about the reading intervention.
  • Finally, you must engage in “ongoing efforts to examine the effects of such activity, strategy, or intervention.” While this commitment may seem like an extra step, it is actually a benefit: it will help determine whether the intervention is working well, needs tweaking, or should go. It’s also the only way to build an evidence base on new approaches. To satisfy this requirement, you could commit to comparing reading scores for students participating in the intervention with groups of similar students, perhaps in different schools or cohorts, who didn’t have that option.

That’s it, and it counts as ARP ESSER evidence for all schools. Even without a published study.

Education leaders should embrace the flexibility in ARP ESSER to make smart local choices about what to implement, driven by local needs and values, and not by misperceptions of which interventions count as evidence-based.

[Read More: Getting to Yes on Covid Relief Spending]

Nora Gordon is an associate professor at Georgetown University's McCourt School of Public Policy and a member of the FutureEd Advisory Board. She and Carrie Conaway are the authors of Common-Sense Evidence: The Education Leader’s Guide to Using Data and Research.

Photo courtesy of Allison Shelley/The Verbatim Agency for American Education: Images of Teachers and Students in Action.