In the often-fraught debate over education policy, there is growing agreement that educators should pay close attention to the development of the social and emotional skills that allow students to persevere when working on difficult tasks, regulate emotions, and work effectively in teams.
But measuring such skills remains a significant challenge. In November, RAND released a web-based tool to help practitioners and researchers identify assessments of social-emotional learning. The RAND Education Assessment Finder provides information about roughly 200 measures of K-12 students’ interpersonal (social) and intra-personal (emotional) competencies, as well as higher-order cognitive competencies such as creativity. Practitioners and researchers can explore what assessments are available, what they are designed to measure, what demands they place on students and teachers, and what kinds of uses their scores support.
At about the same time, the Assessment Work Group convened by the Collaborative for Academic, Social, and Emotional Learning released a new social-emotional learning Assessment Guide, which provides a catalogue of some 20 popular SEL assessments, along with guidance for using them effectively.
But the tools have limits, and these assessments are not yet suitable for high-stakes uses in accountability systems. Even under lower-stakes conditions, the simplistic use of such assessments poses risks that researchers and practitioners should avoid.
First, the supply of strong SEL assessments is limited. Although the RAND Assessment Finder contains roughly 200 measures, the number of high-quality measures with solid evidence of reliability and validity is small. A careful review shows limited availability of assessments for certain skills and grade levels, a gap that could lead users to select weak measures.
There are fewer measures for younger students than for older students, and fewer measures of interpersonal competencies, such as teamwork and social awareness, than of intra-personal competencies, such as self-regulation or grit. Further, RAND's Finder contains relatively few measures that directly assess student skills and competencies. Instead, most assessments rely on student self-reports or teacher judgments gathered via surveys.
Users should be cautious when interpreting self-reports and teacher judgments because respondents may be influenced by a desire to answer in ways they believe reflect social norms, a problem testing experts call “social desirability bias.” Self-reports can also suffer from “reference group bias,” which occurs when students’ responses are shaped by the characteristics of the other students with whom they interact or by the standards of the schools they attend.
We also need to recognize that students develop SEL competencies not only through their experiences in school, but also in their homes, neighborhoods, and other environments. This creates a risk of attributing poor social-emotional performance to students’ weaknesses rather than to contextual conditions that may have hindered the development of these skills. It could encourage educators to blame students rather than address conditions in their school or home environments.
As a result, decisions about how to adjust instruction or programs in response to social-emotional assessment information should not rely exclusively on skills assessments; they should also be informed by data on students’ school and out-of-school environments. The practitioner guidance that RAND developed jointly with the Assessment Work Group provides suggestions for how to incorporate data from multiple sources, including scores on surveys of school climate or information on behavioral incidents.
There are risks for researchers as well, and they should be careful not to rely exclusively on social-emotional measures to assess the effects of programs or policies. At present, few of the measures included in the Assessment Finder have been validated for measuring growth in the competencies they test.
Researchers need to guard against drawing conclusions about how social-emotional learning programs contribute to student growth based on measures that have not been shown to be sensitive enough to detect such growth. In addition, measures that were developed as part of a social-emotional learning curriculum may be too closely aligned with that curriculum to give accurate results about broad program effects. Differences in scores between participants and nonparticipants could reflect exposure to the language used in a curriculum rather than the development of generalizable skills. Using curriculum-aligned measures could therefore overstate a program’s effects.
Researchers at Stanford University-based Policy Analysis for California Education (PACE) are tackling these tough issues with the CORE Districts, a consortium of urban California school systems that have been administering annual surveys on school climate and social and emotional development to nearly a million students for the past several years.
PACE researchers have gathered evidence that the CORE Districts’ surveys produce scores with high reliability and that survey results vary meaningfully across schools. They have demonstrated that the results are correlated with students’ academic achievement and that they explain achievement above and beyond non-academic measures that schools already track, such as chronic absenteeism. The next step is for PACE to explore how effectively the CORE Districts have responded to the survey results.
Tools like the RAND Education Assessment Finder and the Assessment Guide should help. But there’s much research and development work to do to make surveys and other measures of school climate and social-emotional development dependable tools of school improvement.
Laura S. Hamilton is a senior behavioral scientist at the nonprofit, nonpartisan RAND Corporation, where Brian M. Stecher, a FutureEd research advisory board member, is an adjunct senior social scientist.