When Education Secretary Betsy DeVos famously stumbled in her confirmation hearing over a question about measuring student growth versus proficiency, she shone a spotlight on one of the more significant questions in education policy today.
For the past dozen years (the No Child Left Behind era), the primary metric for assessing school performance in most states and in federal policy has been the proficiency rate: the proportion of a school’s students scoring above a state-determined proficiency threshold in math and English language arts, along with whether this proportion met state targets.
From the very earliest days of the law’s enactment, researchers questioned whether proficiency rates were really the best measures of school performance.
Some argued that proficiency rates essentially measure who is enrolled in a school, rather than how well the school is doing at educating them. Because such status measures merely capture the current performance levels of students, proficiency rates are highly correlated with student socioeconomic status and other demographics. Growth-based measures, on the other hand, can show students’ year-to-year changes and better demonstrate the school’s effectiveness or contribution to student learning.
Others pointed out that even among status measures of performance, proficiency is an especially poor one. It creates an incentive to focus efforts primarily on students very near the threshold, as very low and very high achievers are unlikely to cross it in either direction. It also depends heavily on where the state decides to set the proficiency threshold, and states have varied tremendously in that decision.
It also throws away a great deal of information—a student one point above the proficiency threshold looks exactly the same as a student one hundred points above the threshold. Research suggests that this flaw led to a focus on “bubble kids” near the proficiency threshold at the expense of high and low achievers (though I am not aware of research showing there were differential achievement impacts of NCLB accountability based on prior achievement).
Despite these concerns, proficiency rates remained the dominant measure of school performance under NCLB.
The passage of the Every Student Succeeds Act (ESSA) a year ago seemed to offer some relief from the tyranny of proficiency, but the language was not especially clear. The law required a status measure of student performance, but it was not spelled out whether proficiency rates were required or whether states could pick a different metric. Even in the draft regulations that were meant to clarify the law, this point was fuzzy.
Owing to this ambiguity, I penned a letter to the Department of Education during the comment period on draft regulations arguing that the department should broadly interpret the ESSA statute to allow states to use status measures of performance other than percent proficient. In particular, I recommended allowing states to use average scale scores (i.e., the simple average of students’ test scores in a school) as their status measure because this makes better use of the available data.
Failing that, I recommended a performance index that gives schools credit for performance all along the achievement distribution. To see the difference, recall that the proficiency rate is calculated by assigning each proficient student a value of 1 and each non-proficient student a value of 0 and then taking the average score across students.
A performance index might instead give each advanced student a score of 1.1, each proficient student a score of 1, each basic student a score of .7, each below basic student a score of .3, and each far below basic student a score of zero. Again, the average score across students would be the school’s performance level. This letter was endorsed by more than 100 researchers, policymakers, and educators.
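To make the arithmetic concrete, here is a minimal Python sketch of the two calculations. The credit values are the illustrative ones given above; the ten-student school, the function names, and the choice to count "advanced" and "proficient" as the proficient levels are assumptions made up for this example, not anything drawn from a state's actual system.

```python
# Credit per achievement level, using the illustrative weights from the text.
INDEX_CREDIT = {
    "advanced": 1.1,
    "proficient": 1.0,
    "basic": 0.7,
    "below basic": 0.3,
    "far below basic": 0.0,
}

# Assumption: these two levels are the ones at or above the proficiency threshold.
PROFICIENT_LEVELS = {"advanced", "proficient"}


def proficiency_rate(levels):
    """Average of 1 for each proficient student and 0 for everyone else."""
    return sum(1 if lvl in PROFICIENT_LEVELS else 0 for lvl in levels) / len(levels)


def performance_index(levels, credit=INDEX_CREDIT):
    """Average credit across students, with credit set by achievement level."""
    return sum(credit[lvl] for lvl in levels) / len(levels)


# Hypothetical school of ten students (invented data).
school = (
    ["advanced"] * 2
    + ["proficient"] * 3
    + ["basic"] * 3
    + ["below basic"]
    + ["far below basic"]
)

# The same school after its lowest achiever moves up one level,
# still short of proficient.
school_after = school[:-1] + ["below basic"]

print(f"{proficiency_rate(school):.2f}")         # 0.50
print(f"{performance_index(school):.2f}")        # 0.76
print(f"{proficiency_rate(school_after):.2f}")   # 0.50
print(f"{performance_index(school_after):.2f}")  # 0.79
```

In this invented example, the improvement by the lowest-achieving student leaves the proficiency rate unchanged but raises the performance index, which is the sense in which an index gives credit all along the achievement distribution.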
On this point, we earned a partial victory. Specifically, the department’s regulations allow performance indices, but not average scale scores, to become the primary status metric of accountability. The only caveat appears to be that these performance indices are allowed:
“so long as (1) a school receives less credit for the performance of a student that is not yet proficient than for the performance of a student at or above the proficient level; and (2) the credit a school receives for the performance of a more advanced student does not fully compensate for the performance of a student who is not yet proficient.”
This decision is not perfect, but it is considerably better than requiring proficiency rates alone. All states should take advantage of this flexibility, because it substantially reduces the design flaws associated with proficiency rates.
Furthermore, as I read this requirement, states can in fact get their performance indices very close to average scale scores if they simply create many score categories. For instance, rather than just putting students in four categories, why not put them in 10, 20, or 100? As long as the extra credit earned by above-proficient students does not fully compensate for the credit lost on below-proficient ones, I see no reason why the department wouldn’t approve it.
Beyond the issue of measuring performance levels, the law and regulations offer states considerable discretion as to how much they weight growth versus status in their systems. I want to exhort states to put as much weight on growth as possible, because only that can come close to measuring the true contributions of schools to student success. Growth-based measures will better identify both the schools that are helping students to learn and those that need support or intervention.
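The claim that more categories bring an index closer to the average scale score can be checked with a rough simulation. The sketch below rests on several assumptions that are mine alone: a made-up 200 to 800 score scale, simulated scores, equal-width categories, and credit equal to each category's midpoint rescaled to run from 0 to 1. It does not try to verify the two conditions in the regulation, which would depend on where a state sets its proficiency cut and how it assigns credit.

```python
import random


def binned_index(scores, lo, hi, n_categories):
    """Average credit when each score falls into one of n equal-width
    categories and earns that category's midpoint, rescaled to 0-1."""
    width = (hi - lo) / n_categories
    total = 0.0
    for s in scores:
        # Category number, with the top score folded into the last category.
        k = min(int((s - lo) / width), n_categories - 1)
        total += (k + 0.5) / n_categories
    return total / len(scores)


def scaled_mean(scores, lo, hi):
    """Average scale score, rescaled to 0-1 so it is comparable to the index."""
    return sum((s - lo) / (hi - lo) for s in scores) / len(scores)


# Hypothetical data: 500 simulated scores on an invented 200-800 scale,
# clipped so that every score stays within the scale.
rng = random.Random(0)
LO, HI = 200, 800
scores = [min(HI, max(LO, rng.gauss(500, 90))) for _ in range(500)]

target = scaled_mean(scores, LO, HI)
for n in (4, 10, 20, 100):
    gap = abs(binned_index(scores, LO, HI, n) - target)
    print(f"{n:>3} categories: gap from average scale score = {gap:.4f}")
```

Under these assumptions, each student's credit differs from his or her rescaled score by at most half a category width, so the school-level gap can be no larger than 0.5 divided by the number of categories. With 100 categories, that ceiling is half a percent of the score range.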
Together, these two changes—measuring performance levels using a performance index with as many categories as possible and weighting growth more heavily than status—will go a considerable way toward improving the design of accountability policy and reducing unintended consequences.
Morgan Polikoff is an Associate Professor of Education at the USC Rossier School of Education and a FutureEd senior fellow.