Teacher Evaluation and Teacher Effectiveness
James H. Stronge, Leslie W. Grant, Xianxuan Xu
  • LAST REVIEWED: 28 April 2017
  • LAST MODIFIED: 29 October 2013
  • DOI: 10.1093/obo/9780199756810-0138


Teacher evaluation has evolved from a focus on the moral values of the teacher in the early 1900s to today's standards-based evaluation models, which seek to include measures of student academic progress. Teacher evaluation systems often seek to serve two needs: accountability and improvement. Changes in teacher evaluation have been influenced by political winds as well as by a desire to create systems that are fair and balanced. This article provides an overview of the evolution of teacher evaluation, with a focus on measuring teacher effectiveness and including such measures in an overall evaluation system. Earlier evaluation models began to connect evaluation to the roles and responsibilities of the teacher, with a move toward developing standards that delineate teacher performance expectations. These expectations would then serve as the basis for the design and implementation of the evaluation system. In the current era of standards-based accountability, the focus has shifted from evaluation systems that measure the processes of teaching to systems that measure both the processes of teaching and student outcomes. The United States has been at the forefront of this reform trend, both in initiating it and in putting it into practice. The inclusion of student outcomes has been a topic of intense discussion as policymakers and researchers debate the validity of using student test scores in value-added and other growth models. Researchers do not agree on the stability of such models or on whether they actually differentiate between effective and less effective teachers. Although the debate continues, many states within the United States have begun implementing such systems. Similarly, countries across the globe struggle with the basis for teacher evaluation and with how teacher effectiveness research should inform such processes. Research will continue to inform and enrich this debate and discussion.

National Reports

Seminal national reports are often cited in the literature on teacher effectiveness and teacher evaluation. The reports in this section have had state and national implications for policy and practice. Each report focuses on the need to strengthen teacher quality and teacher evaluation systems. Darling-Hammond and Prince 2007 and Hinchey 2010 focus on the importance of strengthening teacher quality through well-constructed teacher evaluation systems. Both caution against the use of a single standardized test score as a significant component of a teacher’s evaluation and stress the need for multiple measures. Weisberg, et al. 2009 exposes the failure of evaluation systems to differentiate between effective and less effective teachers, while Sartain, et al. 2011 provides evidence that a well-designed and well-implemented system can differentiate between teachers and can provide a valid measure of teaching effectiveness, as shown by its correlation with student academic progress. Likewise, Measures of Effective Teaching Project 2013, a project funded by the Bill & Melinda Gates Foundation, provides evidence that the use of a variety of sources such as student surveys, classroom observations, and student academic achievement gains can result in valid and reliable measures of teaching. Finally, Heneman, et al. 2006 provides results from a research study in which pay is linked to teacher evaluation. Again, the evaluation system is validated in that teacher ratings were positively correlated with student achievement. However, linking the evaluation to pay met with a mixed reception. Together, these national reports provide an overview of the national conversation around teaching effectiveness and teacher evaluation.

  • Darling-Hammond, Linda, and Cynthia D. Prince. 2007. Strengthening teacher quality in high-need schools: Policy and practice. Washington, DC: Council of Chief State School Officers.

    A report designed to inform policy and practice related to determining teachers’ effectiveness and evaluating teachers based on effectiveness models. In addition, guidance is offered on improving the quality of math and science teachers and of teachers of diverse learners, and on retaining quality teachers in high-needs schools through the development of strong building-level leaders.

  • Heneman, Herbert G., III, Anthony Milanowski, Steven M. Kimball, and Allan Odden. 2006. Standards-based teacher evaluation as a foundation for knowledge- and skill-based pay. Philadelphia: Consortium for Policy Research in Education.

    A research study examining the design and effectiveness of standards-based evaluation systems and the use of these systems in knowledge- and skills-based pay. Findings suggest a positive correlation between teacher ratings and student achievement, mixed teacher and administrator reactions to the evaluation system, some positive impact on teaching practices, and factors contributing to and limiting implementation of the system.

  • Hinchey, Patricia H. 2010. Getting teacher assessment right: What policy makers can learn from research. Boulder, CO: National Education Policy Center.

    A policy report examining the research related to designing and implementing an evaluation system that can serve dual purposes of formative and summative evaluation. Recommendations for clear evaluation criteria, training of evaluators, and inclusion of student assessment data are provided.

  • Measures of Effective Teaching Project. 2013. Ensuring fair and reliable measures of effective teaching: Culminating findings from the MET Project’s three-year study. Phoenix, AZ: Bill & Melinda Gates Foundation.

    A three-year study that examined the validity and reliability of a teacher evaluation system that used multiple data sources, including student surveys, classroom observations, and student academic achievement gains. Findings indicate that these multiple data sources together can distinguish between more effective and less effective teachers.

  • Sartain, Lauren, Sara Ray Stoelinga, Eric Brown, et al. 2011. Rethinking teacher evaluation in Chicago: Lessons learned from classroom observations, principal-teacher conferences, and district implementation. Chicago: Consortium on Chicago School Research.

    A report on the validation of a standards-based evaluation system in Chicago Public Schools, with implications for districts and states implementing standards-based evaluation systems. Findings suggest that observation ratings correlate with student academic growth, that observation ratings are consistent across raters, that conferencing is beneficial, and that engagement in the evaluation system is high.

  • Weisberg, Daniel, Susan Sexton, Jennifer Mulhern, and David Keeling. 2009. The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. Brooklyn, NY: New Teacher Project.

    A study undertaken to determine the extent to which variation in teacher performance is documented through teacher evaluations. Findings suggest that little variation exists in teacher ratings: effective teachers go unrecognized, while less effective teaching is not addressed. Recommendations for an evaluation system that differentiates among teachers are provided.
