In This Article

  • Introduction
  • Textbooks
  • Paper-Length Introductions
  • Journals
  • Longitudinal Data and Causal Inference
  • Networks and Spillover Effects
  • Heterogeneous Treatment Effects
  • Causal Mediation Analysis
  • Causal Inference, Machine Learning, and Big Data

Causal Inference
Pablo Geraldo Bastías, Jennie E. Brand
  • LAST MODIFIED: 26 February 2020
  • DOI: 10.1093/obo/9780199756384-0240


Introduction

Causal inference is a growing interdisciplinary subfield spanning statistics, computer science, economics, epidemiology, and the social sciences. In contrast with both traditional quantitative methods and cutting-edge approaches like machine learning, causal inference defines its questions in relation to potential outcomes, that is, variable values that are counterfactual to the observed world. Such questions therefore cannot be answered from joint probabilities alone, even with infinite data. The fact that one can observe at most one potential outcome among those of interest is known as the “fundamental problem of causal inference.” For example, in this framework, the economic return to college education can be defined as a comparison between two potential outcomes: the wages of an individual with a college education versus the wages that the same individual would have received had he or she not attended college. In general, researchers are interested in estimating such effects for certain groups and comparing the effects across subpopulations. Critical to causal inference is recognizing that, to answer causal questions from observed data, one has to rely on untestable assumptions about how the data were generated. In other words, there is no particular statistical method that would render a conclusion “causal”; the validity of such an interpretation depends on a combination of data, assumptions about the data-generating process based on expert judgment, and estimation techniques. In the last several decades, our understanding of causality has improved enormously, owing to a conceptual apparatus and a mathematical language that enable rigorous conceptualization of causal quantities and formal representation of causal assumptions, while still employing familiar statistical methods.
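The college example can be written compactly in notation that is common in this literature. The symbols below are assumptions introduced for illustration and do not appear in the text above: $D_i \in \{0, 1\}$ indicates whether individual $i$ attended college, and $Y_i(1)$ and $Y_i(0)$ are that individual's potential wages with and without a college education. The second line additionally assumes that the observed wage equals the potential wage under the treatment actually received (often called consistency).

```latex
\begin{align*}
\tau_i &= Y_i(1) - Y_i(0)
  && \text{(individual-level effect; never observed directly)} \\
Y_i &= D_i \, Y_i(1) + (1 - D_i) \, Y_i(0)
  && \text{(observed wage, assuming consistency)} \\
\mathrm{ATE} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) \right]
  && \text{(average effect in the population)} \\
\mathrm{ATT} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) \mid D_i = 1 \right]
  && \text{(average effect among those who attended)}
\end{align*}
```

Because only one of $Y_i(1)$ and $Y_i(0)$ is observed for any individual, $\tau_i$ cannot be computed from data; group-level quantities such as the ATE and ATT can be recovered only under further assumptions.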
The potential outcomes framework (also known as the Neyman-Rubin causal model) and structural equations encoded as directed acyclic graphs (DAGs; also known as structural causal models) are two common approaches for conceptualizing causal relationships. The symbiosis of both languages offers a powerful framework to address causal questions. This review covers developments in both causal identification (i.e., deciding whether a quantity of interest would be recoverable from infinite data, given our assumptions) and causal effect estimation (i.e., the use of statistical methods to approximate that answer with finite, although potentially big, data). The literature is presented following the types of assumptions and questions frequently encountered in empirical research, ending with a discussion of promising new directions in the field.
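The distinction between identification and estimation can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data-generating process, the binary confounder `x`, the constant effect of 2.0, and all function names are hypothetical and chosen for illustration. Because the data are simulated, both potential outcomes are visible, which is never the case with real data. The assumption that `x` is the only confounder is what identifies the average effect; stratifying on `x` is one simple way to estimate it.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Simulate hypothetical units; both potential outcomes are stored,
    which is possible only because the data are artificial."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = int(rng.random() < 0.5)                  # binary confounder (hypothetical)
        d = int(rng.random() < (0.8 if x else 0.2))  # treatment is more likely when x = 1
        y0 = 10.0 + 5.0 * x + rng.gauss(0.0, 1.0)    # potential outcome without treatment
        y1 = y0 + 2.0                                # potential outcome with treatment
        y = y1 if d else y0                          # only one potential outcome is observed
        rows.append({"x": x, "d": d, "y": y, "y0": y0, "y1": y1})
    return rows


def naive_difference(rows):
    """Difference in observed means between treated and untreated units."""
    treated = [r["y"] for r in rows if r["d"] == 1]
    control = [r["y"] for r in rows if r["d"] == 0]
    return mean(treated) - mean(control)


def adjusted_difference(rows):
    """Stratify on x and average the within-stratum differences,
    weighting each stratum by its share of the sample."""
    total = 0.0
    for x in (0, 1):
        stratum = [r for r in rows if r["x"] == x]
        treated = [r["y"] for r in stratum if r["d"] == 1]
        control = [r["y"] for r in stratum if r["d"] == 0]
        total += (len(stratum) / len(rows)) * (mean(treated) - mean(control))
    return total


rows = simulate()
true_ate = mean(r["y1"] - r["y0"] for r in rows)  # computable only in simulation
print("true ATE:           ", round(true_ate, 2))
print("naive difference:   ", round(naive_difference(rows), 2))
print("adjusted difference:", round(adjusted_difference(rows), 2))
```

Under this hypothetical process the naive comparison should land well above the true effect, because treated units disproportionately have `x = 1`, while the stratified estimate should be close to it. With real data the adjustment is only as credible as the untestable assumption that no other confounders exist.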


Textbooks

As the field of causal inference has consolidated, several introductory textbooks at basic and advanced levels have appeared, covering the fundamentals of causal inference for social scientists. A gentle yet comprehensive introduction is offered by Morgan and Winship 2015, covering both potential outcomes and causal graphs. A comprehensive collection of chapters, with examples and applications, can be found in Morgan 2013. Slightly more technical is Hong 2015, a text that focuses on weighting estimators using potential outcomes. Angrist and Pischke 2009 offers detailed discussions of the methods most frequently used by researchers in economics, and Imbens and Rubin 2015 offers a comprehensive introduction to the potential outcomes model; both are written at a more technical level of exposition. The scope of Rosenbaum 2010 is restricted to observational studies, with a formal but not overly technical treatment. All of the preceding references combine, to some degree, identification and estimation of causal effects. By contrast, the available introductions to the structural causal model focus exclusively on identification of causal effects, where the graphical approach has its major strength. The best and most accessible introduction to the structural causal model can be found in Pearl, et al. 2016, while the more challenging canonical and in-depth exposition of this approach can be found in Pearl 2009.

  • Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly harmless econometrics: An empiricist’s companion. Princeton, NJ: Princeton Univ. Press.

    DOI: 10.1515/9781400829828

    An atypical approach to econometrics, this book focuses on the empirical strategies most widely used in applied research, including regression analysis, instrumental variables, difference-in-differences, regression discontinuity, and quantile regression. The level of exposition requires familiarity with basic probability and statistics.

  • Hong, Guanglei. 2015. Causality in a social world: Moderation, mediation and spill-over. Chichester, UK: John Wiley & Sons.

    DOI: 10.1002/9781119030638

    Extensive coverage of causal inference in social settings, with emphasis on weighting methods to adjust for confounding and to study moderation, mediation, and spillover effects, based on the author’s own methodological contributions. The main text offers an accessible presentation, while the appendices include the formal derivation of the results.

  • Imbens, Guido W., and Donald B. Rubin. 2015. Causal inference for statistics, social, and biomedical sciences: An introduction. Cambridge, UK: Cambridge Univ. Press.

    DOI: 10.1017/CBO9781139025751

    A detailed introduction to the potential outcome framework by two of the leading figures of the approach in economics and statistics. The first chapters offer a conceptual introduction, followed by an extensive treatment of identification and estimation in randomized experiments and observational studies, with a particular focus on matching and propensity scores, and instrumental variables.

  • Morgan, Stephen L., ed. 2013. Handbook of causal analysis for social research. Handbooks of Sociology and Social Research. Dordrecht, The Netherlands: Springer.

    The book contains nineteen chapters covering different research designs and tools for causal inference by renowned experts in the area. Social scientists will find particularly useful the chapters on mixed methods, causal effect heterogeneity, graphical causal models, social networks, and sensitivity analysis.

  • Morgan, Stephen L., and Christopher Winship. 2015. Counterfactuals and causal inference: Methods and principles for social research. 2d ed. Analytical Methods for Social Research. New York: Cambridge Univ. Press.

    An introduction to causal inference in observational settings for a general social science audience, presenting a unified approach drawing both from the Potential Outcomes and Structural Causal Model perspectives. The minimal mathematical and statistical requirements, and the clear exposition of identification assumptions for different designs, make this book well suited as an introduction to the topic.

  • Pearl, Judea. 2009. Causality. Cambridge, UK: Cambridge Univ. Press.

    DOI: 10.1017/CBO9780511803161

    A technical introduction to the structural causal model, this is the fundamental book for those interested in inferring causal effects from non-experimental data using causal graphs. It shows how potential outcomes can be derived from structural models, and the logical equivalence of the two frameworks. The level of exposition makes this book ideal for readers already familiar with the approach.

  • Pearl, Judea, Madelyn Glymour, and Nicholas P. Jewell. 2016. Causal inference in statistics: A primer. Chichester, UK: Wiley.

    The most accessible, yet still rigorous, introduction to the structural causal model approach, including the use of directed acyclic graphs to encode researchers’ assumptions. This text covers the identification of effects of interventions, mediation, and causal attribution. It also includes a useful probability review chapter.

  • Rosenbaum, Paul R. 2010. Design of observational studies. Springer Series in Statistics. New York: Springer.

    DOI: 10.1007/978-1-4419-1213-8

    A detailed exposition of causal inference in observational studies with emphasis on research design, matching methods, and sensitivity analysis, by one of the leading advocates of the Neyman-Rubin approach. More technical than other books in this section, this text requires some background in probability and statistics.
