Methodological Approaches for Impact Evaluation in Educational Settings
- LAST REVIEWED: 26 February 2020
- LAST MODIFIED: 26 February 2020
- DOI: 10.1093/obo/9780199756810-0244
Introduction
Since the start of the War on Poverty in the 1960s, social scientists have developed and refined experimental and quasi-experimental methods for evaluating and understanding the ways in which public policies, programs, and interventions affect people’s lives. The overarching mission of many social scientists is to understand “what works” in education and social policy. These are causal questions about whether an intervention, practice, program, or policy affects some outcome of interest. Although causal questions are not the only relevant questions in program evaluation, they are assumed by many in the fields of public health, economics, social policy, and now education to be the scientific foundation for evidence-based decision making. Fortunately, over the last half-century, two methodological advances have improved the rigor of social science approaches for making causal inferences. The first was acknowledging the primacy of research designs over statistical adjustment procedures. Donald Campbell and colleagues showed how research designs could be used to address many plausible threats to validity. The second methodological advancement was the use of potential outcomes to specify exact causal quantities of interest. This allowed researchers to think systematically about research design assumptions and to develop diagnostic measures for assessing when these assumptions are met. This article reviews important statistical methods for estimating the impact of interventions on outcomes in education settings, particularly programs that are implemented in field, rather than laboratory, settings. We begin by describing the causal inference challenge for evaluating program effects. Then four research designs are discussed that may be used for estimating program impacts. The article highlights what the Campbell tradition identifies as the strongest causal research designs: the randomized experiment and the regression-discontinuity designs. 
These approaches have the advantage of transparent assumptions for yielding causal effects. The article then discusses weaker but more commonly used approaches for estimating effects, including the interrupted time series and the non-equivalent comparison group designs. For the interrupted time series design, differences-in-differences is discussed as a more generalized approach to time series methods; for non-equivalent comparison group designs, the article highlights propensity score matching as a method for creating statistically equivalent groups on the basis of observed covariates. For each research design, references are included that discuss the underlying theory and logic of the method, exemplars of the approach in field settings, and recent methodological extensions to the design. The article concludes with a discussion of practical considerations for evaluating interventions in field settings, including the external validity of estimated effects from impact studies.
General Overviews
The fundamental problem of causal inference is that we cannot observe both what happens to a student when they receive an intervention and what would have occurred in an alternate reality in which the same student did not receive the intervention. For example, researchers can observe what happens to children in a preschool program but cannot observe what would have happened to the same children had they not entered preschool. To study the causal effect of a program or intervention, one needs a counterfactual, or something that is contrary to fact. Given that researchers never observe the counterfactual, they look for approximations (e.g., older siblings, neighborhood children, children in a nationally representative survey, or randomly assigned control children not exposed to the treatment). The Rubin Causal Model introduced in Rubin 1974 formalizes this reasoning mathematically. It is based on the idea that every unit has a potential outcome based on its “assignment” to a treatment or control condition. Using a potential outcomes framework, researchers are able to define a causal estimand of interest for a well-defined treatment and inference population, as well as the assumptions required for a research design to yield a valid effect. Campbell and Stanley 1963 demonstrates how these assumptions may be violated in field settings through its list of “validity threats.” Cook and Campbell 1979 and Shadish et al. 2002 extend this idea by introducing four types of validity threats: threats to internal, external, statistical conclusion, and construct validity. Angrist and Pischke 2009 provides an up-to-date overview of common methodological approaches from an econometric perspective and discusses estimation procedures for producing causal estimates. Angrist and Pischke 2015 offers a more approachable overview of the same material intended for an undergraduate audience.
Imbens and Rubin 2015 and Morgan and Winship 2007 straddle the econometric and statistics literature and offer additional insights about causal inference from a potential outcomes perspective and a causal graph theory perspective, respectively. For an overview of key experimental and quasi-experimental designs specific to the field of education, see Murnane and Willett 2011 and Stuart 2007.
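The potential outcomes reasoning described above can be illustrated with a small simulation. The sketch below is not drawn from any of the cited works; the variable names, the constant treatment effect of 5 points, and the "ability" covariate are all assumptions made purely for illustration. Each simulated student has two potential outcomes, only one of which is ever used for a given assignment, and the difference in observed group means is compared under random assignment and under self-selection on ability.

```python
import random
import statistics

TRUE_EFFECT = 5.0  # assumed constant treatment effect (illustrative only)


def simulate_units(n=10_000, seed=0):
    """Create hypothetical students, each with both potential outcomes."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        ability = rng.gauss(50, 10)        # hypothetical pre-treatment covariate
        y0 = ability + rng.gauss(0, 5)     # outcome if NOT treated
        y1 = y0 + TRUE_EFFECT              # outcome if treated
        units.append((ability, y0, y1))
    return units


def difference_in_means(units, assign):
    """Reveal one potential outcome per unit, then compare group means."""
    treated, control = [], []
    for ability, y0, y1 in units:
        if assign(ability):
            treated.append(y1)   # y0 stays unobserved for this unit
        else:
            control.append(y0)   # y1 stays unobserved for this unit
    return statistics.mean(treated) - statistics.mean(control)


units = simulate_units()
coin = random.Random(1)

# Random assignment: treatment status is independent of ability.
randomized_estimate = difference_in_means(units, lambda a: coin.random() < 0.5)

# Self-selection: higher-ability students opt in, so groups differ at baseline.
self_selected_estimate = difference_in_means(units, lambda a: a > 50)

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"randomized estimate:    {randomized_estimate:.2f}")
print(f"self-selected estimate: {self_selected_estimate:.2f}")
```

Under these assumptions, the randomized comparison should land close to the true effect, while the self-selected comparison mixes the treatment effect with the preexisting ability gap between groups. This is the kind of selection threat to internal validity that the designs reviewed in this article aim to rule out.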
Angrist, J., and J.-S. Pischke. 2009. Mostly harmless econometrics: An empiricist’s companion. Princeton, NJ: Princeton Univ. Press.
This book is a reference on methods of causal inference using a potential outcomes framework. It covers randomized experiments, statistical matching, instrumental variables, difference-in-differences, and regression discontinuity. The book describes each design and its assumptions formally through a series of proofs and informally through applied examples. Though written for a graduate student audience, it is a useful resource for any evaluator with training in probability and statistics.
Angrist, J., and J.-S. Pischke. 2015. Mastering ’metrics: The path from cause to effect. Princeton, NJ: Princeton Univ. Press.
This book provides a more approachable and conversational companion to Angrist and Pischke 2009. While both books describe the same methods of causal inference (randomized controlled trials, statistical matching, instrumental variables, regression discontinuity, and differences-in-differences designs), this book focuses more on conceptual understanding than on formal proofs, though brief proofs are provided. The book is written as an introduction to causal inference for undergraduate economics students.
Campbell, D. T., and J. C. Stanley. 1963. Experimental and quasi-experimental designs for research. Boston, MA: Houghton Mifflin.
This seminal book outlines the major threats to internal validity (Did the intervention cause the observed effect?) and external validity (To what population, settings, treatments, and outcomes can this effect be generalized?) and provides an overview of how design features can address these threats. While the book discusses quasi-experimental designs, it is best suited for an overview of conceptual challenges related to causal inference rather than for guidance in statistical methods in estimating effects.
Cook, T. D., and D. T. Campbell. 1979. Quasi-experimentation: Design and analysis issues for field settings. Boston, MA: Houghton Mifflin.
Similar to Campbell and Stanley 1963, the first chapters of this book introduce the challenge of causal inference and threats to validity. The book updates Campbell and Stanley 1963 by also addressing analytical approaches. Helpfully, the book concludes with a section outlining major obstacles to conducting randomized experiments and describing situations that are particularly conducive to experimental evaluation.
Imbens, G., and D. Rubin. 2015. Causal inference for statistics, social, and biomedical sciences: An introduction. Cambridge, UK: Cambridge Univ. Press.
This textbook provides a rigorous introduction to the potential outcomes framework. Because the book relies on formal mathematical derivations, it is most appropriate for those with a solid understanding of probability and statistics. The book discusses randomized experiments (including instrumental variables for non-compliance) and matching methods but does not provide an overview of quasi-experimental designs. Applied examples from education, social science, and biomedical science are used to illustrate concepts.
Morgan, S., and C. Winship. 2007. Counterfactuals and causal inference: Methods and principles for social research. Cambridge, UK: Cambridge Univ. Press.
This textbook discusses how to answer causal questions using observational data rather than data where researchers have the opportunity to manipulate the treatment assignment. The book discusses randomized experiments primarily as a starting point for understanding non-experimental research designs, but several concepts, including the potential outcomes framework, are explained in detail with the help of causal diagrams, structural models, and examples from the social sciences.
Murnane, R., and J. Willett. 2011. Methods matter: Improving causal inference in educational and social science research. New York: Oxford Univ. Press.
This book is a broadly accessible reference to causal inference in education research. It illustrates important concepts in the design and analysis of randomized experiments, quasi-experiments (including the difference-in-differences, regression discontinuity, and instrumental variables approaches), and observational studies. High-quality causal studies in the field of education are used to demonstrate and evaluate the decisions researchers make in the design and analysis of a study.
Rubin, D. B. 1974. Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66.5: 688–701.
DOI: 10.1037/h0037350
Provides the fundamental building blocks for modern program evaluation. Rubin conceptualizes the fundamental challenge of causal inference using a series of potential outcomes—individual outcomes in the presence of treatment and in the absence of treatment. This conceptualization allows for the formalization of both experimental and non-experimental design assumptions and is often referred to as the Rubin causal model.
Shadish, W. R., T. D. Cook, and D. T. Campbell. 2002. Experimental and quasi-experimental designs for generalized causal inference. Boston, MA: Houghton Mifflin.
This book is a successor to Campbell and Stanley 1963 and Cook and Campbell 1979. It provides a comprehensive discussion of the design elements a researcher may include to improve internal validity, along with the conceptual theory behind research design choices. The latter part of the book proposes a theoretical framework for generalized causal inference.
Stuart, E. A. 2007. Estimating causal effects using school-level data sets. Educational Researcher 36.4: 187–198.
Stuart provides a survey of evaluation approaches with school-level data, including randomized experiments, regression discontinuity, interrupted time series, and non-equivalent comparison group designs. The article provides an overview of the National Longitudinal School-Level State Assessment Score Database (NLSLSASD) and key considerations to keep in mind when using the NLSLSASD or other school-level datasets to answer causal questions.