The Concept of Validity in Psychological Assessment
by
Hedwig Teglasi, Hailey Mae Fleece, Mazneen Cyrus Havewala, Diksha Bali
  • LAST REVIEWED: 12 January 2023
  • LAST MODIFIED: 12 January 2023
  • DOI: 10.1093/obo/9780199828340-0304

Introduction

The concept of validity is central to psychological assessment, providing the theoretical and methodological principles for the development and use of measurement instruments. A consensus has emerged that validity does not reside in the measuring instrument per se, but rather in the inferences drawn from the scores. However, the concept of validity is complex and its intricacies continue to be debated. In this article, we aim to represent the diversity of perspectives on the concept of validity and to draw out the implications of these perspectives for psychological assessment. There has been a movement away from the historical emphasis on types of validity (e.g., content, criterion, construct) and from the view of reliability as distinct from, but related to, validity. These terms have been reconceptualized as different forms of evidence gathered through the process of validation to support the claim to validity. This “unitary” approach to validity has gained traction, though some texts still refer to different types of validity. What counts as evidence in support of validity depends on the basic assumptions about what is being measured (substantive theory) and about how it should be measured (measurement theory). Substantive theory addresses the nature of the phenomena under consideration (i.e., realist and constructivist perspectives), and measurement theory addresses the principles and procedures used to quantify psychological phenomena and to gather evidence of validity. Measurement theory enjoys considerable consensus, but questions regarding substantive theory remain unsettled. Quantitative and qualitative measurement share commonalities in the conception of validity but rely on different validation procedures.
As emphasized in the Standards for testing endorsed by educational and psychological professional associations, decisions based on tests are consequential for people’s lives, warranting consideration of all available evidence, including issues of bias and fairness in the interpretation and the use of scores. Yet when considering the implications of test scores for decision-making, it is hard to escape basic questions about the nature of the phenomenon being measured by a particular method. Gaps between substantive theory and validation procedures, including the use of metrics that do not adequately represent the target phenomenon, reduce the usefulness of the conclusions drawn. Different instruments purporting to measure the same phenomenon may capture different aspects of that phenomenon or may apply in different contexts (hence, low agreement is not fully explained by measurement error). Since clinical assessments often include batteries of tests that address different psychological phenomena relevant to the issues at hand, a more complete conception of validity in psychological assessment would reach beyond the current emphasis on validation of single test scores.

Perspectives on Validity

The question of validity arises whenever psychologists attempt to measure a phenomenon of interest, particularly one that is not directly observable (e.g., intelligence), as discussed in Borsboom, et al. 2003. A definition of validity as the extent to which scores on the measure capture what is intended would suggest that what counts as evidence of validity depends on what is claimed for the measure. For that reason, it is helpful to keep in mind the distinction between the terms validity and validation: the former referring to assumptions about what is being measured, and the latter referring to assumptions about how it should be measured (see Borsboom, et al. 2004). Preferences for particular validation activities are bound up with the philosophy of science, infusing the concept of validity with complexities that continue to be debated (see Lissitz 2009, Markus and Borsboom 2013, Newton and Shaw 2013). Finally, there is also debate about how to incorporate the social consequences of testing into the concept of validity, as seen in Cizek 2012, Kane 2013, and Messick 1998.

  • Borsboom, D., G. J. Mellenbergh, and J. van Heerden. 2003. The theoretical status of latent variables. Psychological Review 110.2:203–219.

    DOI: 10.1037/0033-295X.110.2.203

    The authors discuss the status of psychological attributes that are not directly seen, called latent variables, in test theory models, arguing that interpretation of such models is consistent with an assumption that the variable exists apart from its measures. Causal relations between the variable and scores on its measures in latent variable models apply at the group but not the individual level. The authors emphasize the need to explicitly represent intraindividual processes in the measurement model.

  • Borsboom, D., G. J. Mellenbergh, and J. van Heerden. 2004. The concept of validity. Psychological Review 111.4:1061–1071.

    DOI: 10.1037/0033-295X.111.4.1061

    In this seminal contribution, the authors argue that an instrument is valid under two conditions: the phenomenon it measures exists, and variation in that phenomenon causes variation in responses. They detail the implications of this definition of validity for validation, defined as the process of gathering empirical evidence to support the claim to validity.

  • Cizek, G. J. 2012. Defining and distinguishing validity: Interpretations of score meaning and justifications of test use. Psychological Methods 17.1:31–43.

    DOI: 10.1037/a0026975

    Cizek argues that Messick’s framework (Messick 1989), which subsumes the meaning and the use of the scores among four facets of validity, muddies the validity concept. Therefore, alternative frameworks are needed that treat score meaning and justification of test use as distinct concerns.

  • Kane, M. T. 2013. Validating the interpretations and uses of test scores. Journal of Educational Measurement 50.1:1–73.

    DOI: 10.1111/jedm.12000

    The author’s argument-based approach to validation proposes that score interpretations, not the test, are validated, and that evidence for both test interpretation and test use is necessary.

  • Lissitz, R. W., ed. 2009. The concept of validity: Revisions, new directions, and applications. Charlotte, NC: Information Age.

    This edited book presents the complexities of the validity concept through contributions of respected scholars espousing diverging perspectives who clearly articulate their positions. The reader comes to appreciate the basic and yet to be settled questions about the concept of validity.

  • Markus, K., and D. Borsboom. 2013. Frontiers of validity theory: Measurement, causation, and meaning. New York: Routledge.

    DOI: 10.4324/9780203501207

    The authors examine three fundamental issues related to test validity in the behavioral, social, and educational sciences (measurement, causation, and meaning), providing psychometric and philosophical perspectives and citing unresolved issues. The book highlights conceptual and practical challenges to test construction and use.

  • Messick, S. 1998. Test validity: A matter of consequence. In Validity theory and the methods used in validation: Perspectives from the social and behavioral sciences. Edited by B. D. Zumbo, 35–44. Amsterdam: Kluwer Academic Press.

    The author argues that the consequences of test use and interpretation must be established as part of the validation process.

  • Newton, P. E., and S. D. Shaw. 2013. Standards for talking and thinking about validity. Psychological Methods 18.3:301–319.

    DOI: 10.1037/a0032969

    The authors argue that the two fundamental principles for talking about validity espoused in the Standards for Educational and Psychological Testing have been consistently ignored. They suggest that a technical definition of validity may not be feasible and that it may be more practical to focus on the overall quality of a test.
