Getting Teacher Evaluation Right: What Really Matters for Effectiveness and Improvement

reviewed by Morgan S. Polikoff & Shauna Campbell - July 02, 2014

Title: Getting Teacher Evaluation Right: What Really Matters for Effectiveness and Improvement
Author(s): Linda Darling-Hammond
Publisher: Teachers College Press, New York
ISBN: 0807754463, Pages: 192, Year: 2013

In Getting Teacher Evaluation Right: What Really Matters for Effectiveness and Improvement, Linda Darling-Hammond calls for the adoption of rigorous and holistic teacher evaluation processes. Her argument is grounded in the need for comprehensive, feasible evaluation systems to benefit both teachers and students. She believes that effective teacher evaluation can bring accountability and professionalization to teaching. She sees the importance of a comprehensive system that develops beginning teachers and supports veterans, focusing on practices that reinforce community rather than reduce evaluation to rankings. She cites survey data showing that teachers find evaluation practices to be ineffective and the tenure process to be meaningless due to low dismissal rates. Darling-Hammond writes that increased rigor in teacher evaluation practices can legitimize the profession in a manner similar to law and medicine.

Darling-Hammond’s proposed teacher evaluation system contains five key elements. The first is the adoption of common statewide teaching standards that are aligned with student learning objectives. The second is the performance-based and holistic nature of the evaluations to guide licensure and certification. The third is the alignment of local evaluation practices with the common statewide standards. The fourth is the creation of systems of support for teachers who need assistance. The fifth is the implementation of professional development that reinforces the goals of the evaluation system.

Darling-Hammond recognizes that the successful implementation of an evaluation protocol will depend upon a supportive infrastructure. This means training for evaluators and principals; collaboration among unions, school boards, and other actors; and investment in human resources to develop and sustain evaluation systems. Some of the systemic conditions necessary for success are clear standards for licensure and evaluation, a continuum of performance assessments beginning with initial licensure and continuing throughout a teacher’s career, multiple measures of evaluation including observation of practice and student learning outcomes, and professional development.

The book has many strengths, including a well-reasoned and well-supported main argument for a comprehensive, systemic approach to teacher evaluation. Darling-Hammond is skilled at presenting compelling evidence in support of her claims, so readers may find little with which to disagree. Darling-Hammond’s advocacy for teachers and students is evident in her writing. She promotes a system meant to capitalize on the skill and talent of educators, and she recognizes teachers as valuable members of the education system. She argues for high expectations and rigorous evaluation coupled with reflection, education, and improvement to practice.

Perhaps the most controversial section of the book is the critique of value-added models (VAMs) for teacher evaluation, to which Darling-Hammond devotes a whole chapter. She presents carefully chosen research arguing against VAMs on the grounds that they are too unstable and biased. She relies heavily on anecdotes of teachers rated effective in one year and not the next, or of teachers who teach challenging classes and subsequently receive low VAM scores. This chapter is compelling and will provide useful fodder for VAM opponents, but it does not present the full spectrum of VAM research. For example, some experimental research finds that VAMs are not biased by student and teacher sorting (e.g., Kane, McCaffrey, Miller, & Staiger, 2013).

While she never clearly states that VAMs should not be used, that seems to be her position given the extensive list of criteria she offers for judging the potential utility of various measures. Instead of VAMs, Darling-Hammond recommends assessing teacher contributions to student learning based on multiple measures of student learning outcomes, including portfolios, subject-specific pre- and post-tests, qualitative analyses, team learning goals, and student learning objectives. These recommendations may well contribute to a holistic, equitable evaluation process, but we wonder about the quality of these alternative measures and the feasibility of implementing them. While Darling-Hammond presents a scathing critique of value-added models, she does not offer strong evidence that her proposed alternative student learning measures can overcome the technical limitations of VAMs. We suspect this is because evidence that they can simply does not yet exist (Gill, Bruch, & Booker, 2013). Furthermore, she neglects to note that instability and bias may characterize observational and student survey measures of teacher effectiveness as well (Polikoff, 2013; Whitehurst, Chingos, & Lindquist, 2014).

Though we have few major criticisms of the work, we are left wondering about its intended audience. The practical nature of the recommendations leads us to believe the book is meant for district superintendents or policymakers. Darling-Hammond provides numerous vignettes describing effective implementation of teacher evaluation protocols throughout the country and internationally, though we sometimes wished she had connected these vignettes to subsequent impacts on student performance. The insights offered with these vignettes could be useful for district officials, but the sheer number of vignettes might make it difficult for superintendents or policymakers to choose or enact one. With so many exemplary systems, administrators may struggle to identify the one that is right for their school. Darling-Hammond might instead have focused on one ideal model with clear implementation guidelines and suggestions for local adaptations. This might have made it easier for administrators to directly apply the book’s lessons to their own sites.

Another concern is that Darling-Hammond seems to downplay or overlook the massive changes in policy and practice that would be required to implement systems like those she proposes. For one, the systems she advocates would require a near-complete reconceptualization of the role of the principal, requiring school leaders to develop deep expertise in instruction and to spend substantial proportions of their time observing and providing feedback to teachers. Perhaps this would be a welcome change, but the pluses and minuses of such a radical shift at least merit discussion. For another, Darling-Hammond barely mentions important policy questions: (a) At what level (federal, state, district, school) should evaluation policies be designed? (b) How should evaluations inform tenure and continuing teacher employment decisions? (c) If the technical properties of VAMs make them inadequate to be used, what kinds of thresholds should policymakers consider in choosing measures? Finally, several of the criteria presented in the book, both for student learning measures and observational measures, are likely so challenging that they’re effectively impossible to meet. Should administrators let the perfect be the enemy of the good when it comes to evaluation?

Ultimately, Darling-Hammond makes a persuasive case for overhauling haphazard and effectively useless teacher evaluation systems. The systems Darling-Hammond proposes would greatly improve many existing methods of teacher evaluation. The examples presented in the text offer useful descriptions of successful programs. However, the challenges of designing and implementing such systems may end up being insurmountable for some districts.


Gill, B., Bruch, J., & Booker, K. (2013). Using alternative student growth measures for evaluating teacher performance: What the literature says. Washington, DC: US Department of Education.

Kane, T. J., McCaffrey, D. F., Miller, T., & Staiger, D. O. (2013). Have we identified effective teachers? Validating measures of effective teaching using random assignment. Seattle, WA: Bill and Melinda Gates Foundation.

Polikoff, M. S. (2013). The stability of observational and student survey measures of teaching effectiveness. Paper presented at the 2013 Annual Conference of the Association for Education Finance and Policy, New Orleans, LA.

Whitehurst, G. J., Chingos, M. M., & Lindquist, K. M. (2014). Evaluating teachers with classroom observations: Lessons learned in four districts. Washington, DC: Brookings Institution.

Cite This Article as: Teachers College Record, Date Published: July 02, 2014
https://www.tcrecord.org ID Number: 17585, Date Accessed: 10/22/2021 10:32:20 PM

About the Author
  • Morgan Polikoff
    University of Southern California Rossier School of Education
    Morgan Polikoff is an assistant professor at the University of Southern California Rossier School of Education. His research focuses on the design, implementation, and effects of standards, assessment, and accountability policies.
  • Shauna Campbell
    University of Southern California Rossier School of Education
    Shauna Campbell is a Provost Fellow in the Urban Education Policy PhD program at the University of Southern California's Rossier School of Education. Her research interests include gifted education and accountability.