From the Field

Research Notes: Teachers Assigned to High-Achieving, Well-Behaved Classrooms Receive Better Performance Ratings

Classroom observations are often a significant component of teacher evaluations. But a new study by William Delgado of the University of Chicago and Lauren Sartain of the University of North Carolina at Chapel Hill argues that observer bias may conflate classroom composition with teacher quality, potentially disadvantaging educators assigned to the most challenging classrooms. The authors find that teachers receive higher ratings when they teach students with fewer behavioral issues and better grades and test scores.

Using administrative data from the Chicago Public Schools (CPS) system during the 2012-13 to 2016-17 school years, the authors examine the causal impact of classroom characteristics on teacher evaluations. They match teachers in grades 3 through 8 to their annual classroom observation scores and student rosters, incorporating demographic information on gender, race/ethnicity, free or reduced-price lunch status, and special education status. Then, measures such as test scores, GPA, attendance, suspensions, and grade repetition are aggregated into a classroom quality index (CQI), where higher values reflect higher-performing, better-behaved classrooms.
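
The summary above does not spell out how the index is built, so the following is only a minimal sketch of one common way to form such a composite: standardize each classroom measure, flip the sign of measures where lower is better, and average the results. The measure names, the equal weighting, and the sample values are all assumptions made for illustration, not the authors' specification or data.

```python
from statistics import mean, pstdev

# Hypothetical measure names (not from the study).
# +1 means higher values are better; -1 means lower values are better.
MEASURE_SIGNS = {
    "test_score": 1,
    "gpa": 1,
    "attendance_rate": 1,
    "suspensions": -1,
    "grade_repetition_rate": -1,
}


def standardize(values):
    """Return z-scores; a column with no variation maps to all zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def classroom_quality_index(classrooms):
    """Equal-weight average of sign-adjusted z-scores, one value per classroom.

    `classrooms` is a list of dicts keyed by the names in MEASURE_SIGNS.
    Equal weighting is an assumption of this sketch.
    """
    columns = {
        name: standardize([c[name] for c in classrooms])
        for name in MEASURE_SIGNS
    }
    return [
        mean(sign * columns[name][i] for name, sign in MEASURE_SIGNS.items())
        for i in range(len(classrooms))
    ]


# Made-up classroom averages, for illustration only.
classrooms = [
    {"test_score": 210, "gpa": 2.4, "attendance_rate": 0.90,
     "suspensions": 6, "grade_repetition_rate": 0.08},
    {"test_score": 225, "gpa": 3.0, "attendance_rate": 0.94,
     "suspensions": 2, "grade_repetition_rate": 0.03},
    {"test_score": 240, "gpa": 3.5, "attendance_rate": 0.97,
     "suspensions": 0, "grade_repetition_rate": 0.01},
]
cqi = classroom_quality_index(classrooms)
```

Under this construction, a classroom that does better on every measure receives a higher index value, matching the description of the CQI as rising with performance and behavior.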

The analysis finds that teachers with higher-CQI classrooms consistently receive higher observation scores, even after accounting for teacher and school characteristics. Larger class sizes are associated with slightly lower ratings, though the effect is modest. Student demographics, meanwhile, are not statistically significant predictors of observation scores.

The authors also simulate a policy experiment in which they adjust teacher performance ratings for classroom composition based on CQI scores. Black educators benefit most, gaining 8 percentage points on their observation score and becoming 16 percentage points more likely to remain in the top 5 percent of teachers. Hispanic and novice teachers also see gains, though smaller ones.
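
To make the idea of adjusting ratings for classroom composition concrete, the sketch below fits a straight line of observation score on a classroom index and subtracts the predicted part, re-centering on the average score. This is one simple, assumed approach (a single-variable least-squares residual), not necessarily the adjustment the authors simulate, and the numbers are invented for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def adjust_scores(index_values, scores):
    """Remove the part of each score predicted by the classroom index.

    The adjustment is centered on the mean index, so the average
    score is unchanged; only the relative positions of teachers move.
    """
    _, slope = fit_line(index_values, scores)
    index_bar = mean(index_values)
    return [s - slope * (c - index_bar) for c, s in zip(index_values, scores)]


# Made-up index values and observation scores, for illustration only.
example_index = [-1.0, -0.5, 0.0, 0.5, 1.0]
example_scores = [2.6, 2.9, 3.0, 3.3, 3.2]
adjusted = adjust_scores(example_index, example_scores)
```

In this toy example, the teacher with the lowest index value sees the rating rise after adjustment and the teacher with the highest sees it fall, while the overall average stays the same.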

Given the widespread use of classroom observations, Delgado and Sartain argue that teacher evaluations that fail to account for classroom composition could systematically disadvantage certain teachers, particularly Black educators who are more likely to teach higher-need classrooms. Mitigating potential observer bias, they argue, “can improve fairness, particularly for teachers serving historically underserved students.”

Classroom Composition Affects Teacher Performance Ratings

William Delgado & Lauren Sartain
February 2026