Simpson's Paradox in Psychology
- LAST REVIEWED: 29 November 2022
- LAST MODIFIED: 29 November 2022
- DOI: 10.1093/obo/9780199828340-0301
Introduction
Simpson’s paradox, also called the reversal paradox and the amalgamation paradox, is a statistical phenomenon in which data aggregated at the group level (or at the level of a set of groups) support a conclusion that is either absent from, or opposite to, the conclusion suggested by the same data before aggregation, at the individual level (or at the level of single groups). The paradox is resolved when the data are stratified by group in the statistical model. An intuitive example of Simpson’s paradox is the correlation between typing speed and typos. At the group level, the correlation is negative: experienced typists type faster and make fewer typos. At the individual level, however, the correlation is positive: the faster an individual types, the more typos that individual makes. Thus, it would be fallacious to conclude that the relationship between typing speed and typos observed at the group level holds at the individual level. Simpson’s paradox is especially problematic in the physical and social sciences, where statistical trends in point data observed at the group level are often fallaciously used to draw inferences about individuals or, less often, the other way round. Hence, equivalence at the group and individual levels must be explicitly tested.
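The typing example can be made concrete with a small sketch. The three typists, their baseline speeds and typo rates, and the slope of 0.2 extra typos per additional word per minute are all hypothetical values chosen for illustration, not data from any study. Within each typist, faster sessions carry more typos; across typists, higher baseline speed goes with fewer typos.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical typists: (baseline speed in words/min, baseline typos per page).
baselines = {"novice": (40, 12), "intermediate": (60, 8), "expert": (80, 4)}
# Hypothetical session-to-session deviations in speed around each baseline.
deviations = [-10, -5, 0, 5, 10]

# Each session: (speed, typos); typing faster than baseline adds typos.
sessions = {
    name: [(speed + d, typos + 0.2 * d) for d in deviations]
    for name, (speed, typos) in baselines.items()
}

# Individual level: correlation computed separately for each typist.
within_r = {name: pearson_r(*zip(*rows)) for name, rows in sessions.items()}

# Group level: all sessions pooled, ignoring who produced them.
pooled = [row for rows in sessions.values() for row in rows]
pooled_r = pearson_r(*zip(*pooled))

for name, r in within_r.items():
    print(f"{name}: within-person r = {r:.2f}")
print(f"pooled r = {pooled_r:.2f}")
```

With these made-up numbers every within-person correlation is positive while the pooled correlation is negative (about -0.68), so the sign of the relationship depends on the level of analysis.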
History of Simpson’s Paradox
Simpson 1951 first addressed this phenomenon, showing how combining contingency tables can yield paradoxical conclusions, specifically reporting associations that disappeared upon aggregation, although the earlier works Pearson, et al. 1899 and Yule 1903 noticed a similar phenomenon. Simpson noticed that, depending on the story behind the data, the “sensible interpretation” is sometimes compatible with the aggregate population and sometimes with the disaggregated subpopulations. About two decades later, Blyth 1972 found that aggregation can even reverse the sign of a statistical relationship, and the author labeled the phenomenon Simpson’s paradox in honor of Simpson, although sign reversal was first noted by Cohen and Nagel 1934. Lindley and Novick 1981 amplified Simpson’s paradox by showing that no statistical criterion can warn against drawing wrong conclusions or indicate whether aggregated or disaggregated data would support the correct conclusion. Critically, Lindley and Novick highlighted that when distinct contexts compel distinct conclusions based on the same data, the choice of conclusion must be driven not by statistical considerations but by additional information extracted from the context; that is, it must invoke some form of causality. Interested readers can refer to several excellent resources that discuss statistical methods to prevent, diagnose, and treat Simpson’s paradox in statistical point estimates, such as Adolf, et al. 2014; Fisher, et al. 2018; and Kievit, et al. 2013.
Adolf, J., N. K. Schuurman, P. Borkenau, D. Borsboom, and C. V. Dolan. 2014. Measurement invariance within and between individuals: A distinct problem in testing the equivalence of intra- and inter-individual model structures. Frontiers in Psychology 5: 883.
Addresses the equivalence between results obtained at intra-individual and inter-individual levels of psychometric analysis in the context of a linear state-space model, i.e., a time series model with latent variables. Considers invariance constraints under which results can be generalized (i) over time within subjects, (ii) over subjects within occasions, and (iii) over time and subjects simultaneously. Relates problems of time- and subject-equivalence to problems of nonergodicity.
Blyth, C. R. 1972. On Simpson’s paradox and the sure-thing principle. Journal of the American Statistical Association 67.338: 364–366.
DOI: 10.1080/01621459.1972.10482387
Presents a primer on Simpson’s paradox in the mathematical language of probability.
Cohen, M., and E. Nagel. 1934. An introduction to logic and the scientific method. New York: Harcourt, Brace.
Reports sign reversal of the relationship between two variables upon aggregation.
Fisher, A. J., J. D. Medaglia, and B. F. Jeronimus. 2018. Lack of group-to-individual generalizability is a threat to human subjects research. Proceedings of the National Academy of Sciences 115.27: E6106–E6115.
An influential article stating that “statistical findings at the inter-individual (group) level generalize to the intra-individual (person) level only if the process is ergodic,” i.e., the effects remain homogeneous across individuals and stable over time. Shows that ergodicity does not hold in multiple published datasets, threatening human subjects research. Emphasizes that researchers must explicitly test for equivalence of processes at the group and individual level in social and medical sciences.
Kievit, R., W. Frankenhuis, L. Waldorp, and D. Borsboom. 2013. Simpson’s paradox in psychological science: A practical guide. Frontiers in Psychology 4: 513.
Reviews findings from multiple disciplines to argue that Simpson’s paradox is common and typically results in incorrect interpretations with potentially harmful consequences. Shows that Simpson’s paradox is most likely to occur when drawing inferences across different levels of analysis (e.g., from populations to subgroups, or subgroups to individuals). Proposes statistical markers indicative of Simpson’s paradox and offers psychometric solutions for dealing with it, including a toolbox in R for detecting Simpson’s paradox.
Lindley, D. V., and M. R. Novick. 1981. The role of exchangeability in inference. The Annals of Statistics 9.1: 45–58.
Shows that no statistical criterion would warn the investigator against drawing the wrong conclusions due to Simpson’s paradox or indicate which (subset of) data would lead to the correct conclusion.
Pearson, K., A. Lee, and L. Bramley-Moore. 1899. Genetic (reproductive) selection: Inheritance of fertility in man, and of fecundity in thoroughbred racehorses. Philosophical Transactions of the Royal Society: Series A 192: 257–330.
Presents an extensive discussion of spurious correlations in the case of continuous variables. Shows that pooling two separate records, for each of which the correlation is zero, necessarily creates a spurious correlation unless the mean of at least one of the variables is the same in the two cases.
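The pooling result described above follows from the standard decomposition of a pooled covariance into within-record and between-record parts. The notation is an assumption introduced here for illustration: the two records have proportions $w_1$ and $w_2$ (with $w_1 + w_2 = 1$), within-record covariances $c_1$ and $c_2$, and means $\mu_{X1}, \mu_{X2}$ and $\mu_{Y1}, \mu_{Y2}$.

```latex
\begin{align}
\operatorname{Cov}(X, Y)
  &= \underbrace{w_1 c_1 + w_2 c_2}_{\text{within records}}
   + \underbrace{w_1 w_2 \,(\mu_{X1} - \mu_{X2})(\mu_{Y1} - \mu_{Y2})}_{\text{between records}} \\
\intertext{so that, when both within-record covariances are zero ($c_1 = c_2 = 0$),}
\operatorname{Cov}(X, Y)
  &= w_1 w_2 \,(\mu_{X1} - \mu_{X2})(\mu_{Y1} - \mu_{Y2}).
\end{align}
```

The pooled covariance, and hence the pooled correlation, is therefore nonzero unless $\mu_{X1} = \mu_{X2}$ or $\mu_{Y1} = \mu_{Y2}$, that is, unless at least one variable has the same mean in the two records.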
Simpson, E. H. 1951. The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society: Series B (Methodological) 13.2: 238–241.
DOI: 10.1111/j.2517-6161.1951.tb00088.x
The most influential article on Simpson’s paradox and the one which led to the paradox being named Simpson’s paradox. Discusses how in a 2×2×2 contingency table, there may exist associations or interactions of given attributes in pairs and a second-order interaction when considering all three pairs together.
Yule, G. U. 1903. Notes on the theory of association of attributes in statistics. Biometrika 2.2: 121–134.
Probably the earliest documented discussion of Simpson’s paradox, predating the name itself. Uses, among numerous other examples, the hypothetical example of an anti-toxin that could appear to be a ‘cure’ because of a sex-related difference in mortality rates.
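A minimal sketch of the kind of anti-toxin example described above, using invented counts (they are not Yule's figures). In this hypothetical table the treatment makes no difference within either sex, but one sex both recovers more often and is treated more often, so the pooled table suggests a benefit.

```python
# Hypothetical counts: counts[sex][arm] = (recovered, total patients).
counts = {
    "male": {"treated": (64, 80), "untreated": (16, 20)},
    "female": {"treated": (8, 20), "untreated": (32, 80)},
}

def rate(recovered, total):
    """Proportion of patients who recovered."""
    return recovered / total

def pooled_rate(arm):
    """Recovery rate for one arm after collapsing over sex."""
    recovered = sum(counts[sex][arm][0] for sex in counts)
    total = sum(counts[sex][arm][1] for sex in counts)
    return rate(recovered, total)

# Treated-minus-untreated recovery rate within each sex (stratified view).
stratum_gap = {
    sex: rate(*arms["treated"]) - rate(*arms["untreated"])
    for sex, arms in counts.items()
}

# The same difference after aggregating the two tables (pooled view).
pooled_gap = pooled_rate("treated") - pooled_rate("untreated")

for sex, gap in stratum_gap.items():
    print(f"{sex}: treated - untreated = {gap:+.2f}")
print(f"pooled: treated - untreated = {pooled_gap:+.2f}")
```

Here the within-sex differences are zero while the pooled difference is 0.24 in favor of the treatment, an association created entirely by aggregation.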