Exogenous Variables and Value-Added Assessments: A Fatal Flaw
by David C. Berliner - 2014
Background: Value-added assessment of teachers has grown rapidly to meet the widely supported policy goal of identifying the most effective and the most ineffective teachers in a school system. The former group is to be rewarded, while the latter group is to be helped or fired for poor performance. But value-added approaches to teacher evaluation have many problems. Chief among them is the commonly found class-to-class and year-to-year unreliability of the scores obtained. A teacher’s value-added scores appear to be highly unstable across two classes of the same subject taught in the same semester, and from class to class across two adjacent years.
Focus of Study: This literature review first focuses on the confusion in the minds of the public and politicians between teachers’ effects on individual students, which may be great and usually positive, and teachers’ effects on classroom mean achievement scores, which may be limited by the huge number of exogenous variables affecting those scores. Exogenous variables are unaccounted-for influences on the data, such as peer classroom effects, school compositional effects, and characteristics of the neighborhoods in which some students live. Further, even when some of these variables are measured, the interactions among them often go unexamined, although two-way and three-way interactions are quite likely to occur and to influence classroom achievement. This analysis promotes the idea that the ubiquitous and powerful effects of these myriad exogenous variables on value-added scores are the reason that almost all current research finds instability in teachers’ classroom behavior and in teachers’ value-added scores. This instability may pose a fatal flaw in implementing value-added assessments of teaching competency.
Research Design: This is an analytic essay built on a selective literature review that includes some secondary analyses.
Conclusions: I conclude that because of the effects of countless exogenous variables on student classroom achievement, value-added assessments are not now, and may never be, stable enough from class to class or year to year to be used in evaluating teachers. The hope is that with three or more years of value-added data, the identification of extremely good and extremely bad teachers might be possible; but that goal is not assured, and empirical results suggest that it is quite hard to reliably identify extremely good and extremely bad groups of teachers. In fact, when picking extremes among teachers, both luck and regression to the mean will combine with the interactions of many variables to produce instability in the value-added scores that are obtained. Examination of the apparently simple policy goal of identifying the best and worst teachers in a school system reveals a morally problematic and psychometrically inadequate base for those policies. In fact, the belief that there are thousands of consistently inadequate teachers may be like the search for welfare queens and disability scam artists: more sensationalism than reality.
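Illustration: The point about luck and regression to the mean can be made concrete with a small simulation. The sketch below is not drawn from the article or its data. It assumes, purely for illustration, that each teacher has a fixed "true" effect and that each year's value-added score adds independent noise standing in for the exogenous variables. The function name, the number of teachers, the noise level, and the 10% cutoff are all hypothetical choices.

```python
import random
import statistics


def simulate_value_added(n_teachers=2000, noise_sd=1.5, top_share=0.10, seed=0):
    """Toy model: observed score = true effect + yearly noise (all values assumed)."""
    rng = random.Random(seed)

    # Hypothetical stable teacher effects, standardized to mean 0 and SD 1.
    true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]

    # Two years of observed scores; the noise term stands in for exogenous influences.
    year1 = [t + rng.gauss(0.0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0.0, noise_sd) for t in true_effect]

    # Rank teachers by observed score in each year, best first.
    k = int(n_teachers * top_share)
    ranked1 = sorted(range(n_teachers), key=lambda i: year1[i], reverse=True)
    ranked2 = sorted(range(n_teachers), key=lambda i: year2[i], reverse=True)
    top1 = ranked1[:k]
    top2 = set(ranked2[:k])

    return {
        # Mean year-1 score of the teachers labeled "best" in year 1.
        "top_mean_year1": statistics.mean(year1[i] for i in top1),
        # Mean year-2 score of those same teachers.
        "top_mean_year2": statistics.mean(year2[i] for i in top1),
        # Share of the year-1 "best" group that is again in the top group in year 2.
        "share_still_top": sum(i in top2 for i in top1) / k,
    }


result = simulate_value_added()
print(result)
```

Under these assumptions, the group labeled "best" in year 1 scores closer to the overall average in year 2, and only part of it stays in the top group, even though no teacher's true effect changed. How strong the effect is depends entirely on the assumed noise level, so the sketch shows the mechanism and does not estimate the instability of any real value-added system.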