Data Science Methods for Psychology
Jeffrey Stanton
  • LAST REVIEWED: 26 February 2020
  • LAST MODIFIED: 26 February 2020
  • DOI: 10.1093/obo/9780199828340-0259


The term “data science” refers to an emerging field of research and practice that focuses on obtaining, processing, visualizing, analyzing, preserving, and re-using large collections of information. A related term, “big data,” has been used to refer to one of the important challenges faced by data scientists in many applied environments: the need to analyze large data sources, in certain cases using high-speed, real-time data analysis techniques. Data science encompasses much more than big data, however, as a result of many advancements in cognate fields such as computer science and statistics. Data science has also benefited from the widespread availability of inexpensive computing hardware, a development that has enabled “cloud-based” services for the storage and analysis of large data sets.
The techniques and tools of data science have broad applicability in the sciences. Within the field of psychology, data science offers new opportunities for data collection and data analysis that have begun to streamline and augment efforts to investigate the brain and behavior. The tools of data science also enable new areas of research, such as computational neuroscience. As an example of the impact of data science, psychologists frequently use predictive analysis as an investigative tool to probe the relationships between a set of independent variables and one or more dependent variables. While predictive analysis has traditionally been accomplished with techniques such as multiple regression, recent developments in the area of machine learning have put new predictive tools in the hands of psychologists. These machine learning tools relax distributional assumptions and facilitate exploration of non-linear relationships among variables. These tools also enable the analysis of large data sets by opening options for parallel processing.
In this article, a range of relevant areas from data science is reviewed for applicability to key research problems in psychology, including large-scale data collection, exploratory data analysis, confirmatory data analysis, and visualization. This bibliography covers data mining, machine learning, deep learning, natural language processing, Bayesian data analysis, visualization, crowdsourcing, web scraping, open source software, application programming interfaces, and research resources such as journals and textbooks.
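
The contrast drawn in the introduction between regression-based prediction and machine learning tools that accommodate non-linear relationships can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not a method taken from any of the cited works: the data are simulated with a quadratic relationship, k-nearest-neighbor regression stands in for "machine learning" in general, and all function names (`fit_line`, `knn_predict`, `simulate`) are hypothetical.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def knn_predict(train_x, train_y, x, k=15):
    """Predict y at x as the mean outcome of the k nearest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return statistics.fmean([y for _, y in nearest])

def mse(preds, ys):
    """Mean squared prediction error."""
    return statistics.fmean([(p - y) ** 2 for p, y in zip(preds, ys)])

rng = random.Random(0)  # fixed seed so the simulation is repeatable

def simulate(n):
    """Simulated (not real) data: y depends on x through a U-shaped curve."""
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    ys = [x ** 2 + rng.gauss(0, 0.5) for x in xs]
    return xs, ys

train_x, train_y = simulate(400)
test_x, test_y = simulate(200)

# Linear model: assumes a straight-line relationship.
b0, b1 = fit_line(train_x, train_y)
linear_mse = mse([b0 + b1 * x for x in test_x], test_y)

# k-NN: makes no assumption about the functional form.
knn_mse = mse([knn_predict(train_x, train_y, x) for x in test_x], test_y)

print(f"linear regression test MSE: {linear_mse:.2f}")
print(f"k-NN regression test MSE:   {knn_mse:.2f}")
```

Because the simulated relationship is U-shaped, the straight-line fit has little to work with, whereas the nearest-neighbor predictor follows the curve without being told its form. With real psychological data the gap may be much smaller or absent, so held-out error of this kind is best treated as an empirical check rather than a foregone conclusion.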

Overviews of Data Science for Psychologists

As an emerging and rapidly evolving field, data science presents a moving target with respect to its applicability to psychology and other scientific fields. Researchers began to realize the applicability of data science techniques to psychology early in the 2010s. For example, Tonidandel, et al. 2015 is an edited book with chapters describing applications of data science to industrial and organizational psychology. In a more recent edited book on technology in organizations, Landers 2019 offers chapters on crowdsourcing and artificial intelligence, among other areas. Markowetz, et al. 2014 forecasts that digital devices such as smartphones will revolutionize the collection of large-scale behavioral and social data in ways that improve clinical research. From a neuroscience perspective, Gomez-Marin, et al. 2014 argues that large-scale data collection about the brain and behavior could lead to the development of “ethomes,” or complete descriptions of the behavior patterns of a species. Cheung and Jak 2016 provides an overview of the intersection between psychology and data science and recommends a specific research strategy that would leverage the power of meta-analysis to make sense of multiple study replications sampled from a large data source.
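
The general idea behind a split/analyze/meta-analyze strategy can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the procedure as specified in Cheung and Jak 2016: the data are simulated, the split is a random partition into equal-sized subsets, the per-subset analysis is a Pearson correlation, and the pooling step is a basic fixed-effect meta-analysis on Fisher z values. The function names (`pearson_r`, `split_analyze_meta`) are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def split_analyze_meta(xs, ys, k, rng):
    """Split the data into k random subsets, correlate within each,
    then pool the correlations with a fixed-effect meta-analysis."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    chunks = [idx[i::k] for i in range(k)]      # step 1: split
    num = den = 0.0
    for chunk in chunks:
        r = pearson_r([xs[i] for i in chunk],   # step 2: analyze
                      [ys[i] for i in chunk])
        w = len(chunk) - 3                      # inverse variance of Fisher z
        num += w * math.atanh(r)                # Fisher z transform of r
        den += w
    z = num / den                               # step 3: meta-analyze
    se = 1 / math.sqrt(den)
    ci = (math.tanh(z - 1.96 * se), math.tanh(z + 1.96 * se))
    return math.tanh(z), ci                     # back-transform to r

# Simulated (not real) data with a known population correlation of 0.3.
rng = random.Random(42)
true_r = 0.3
xs = [rng.gauss(0, 1) for _ in range(20000)]
ys = [true_r * x + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1) for x in xs]

pooled_r, ci = split_analyze_meta(xs, ys, k=20, rng=rng)
print(f"pooled r = {pooled_r:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In practice the split might follow a substantive grouping (for example, by site or country) rather than a random partition, and a random-effects model may be more appropriate when the subsets are expected to differ; both choices here are simplifications made for brevity.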

  • Cheung, M. W. L., and S. Jak. 2016. Analyzing big data in psychology: A split/analyze/meta-analyze approach. Frontiers in Psychology 7:738.

    Offers sampling and analytical guidance for psychological researchers who have access to larger data sets.

  • Gomez-Marin, A., J. J. Paton, A. R. Kampff, R. M. Costa, and Z. F. Mainen. 2014. Big behavioral data: Psychology, ethology and the foundations of neuroscience. Nature Neuroscience 17.11: 1455.

    Argues that technology provides automated methods for gathering large-scale information about behavior in natural environments and that creating open databases of such behavioral data could advance neuroscience.

  • Landers, R. N., ed. 2019. The Cambridge handbook of technology and employee behavior. Cambridge, UK: Cambridge Univ. Press.

    Offers thirty-three chapters on technology in the workplace, including several applications of data science to employment-related issues.

  • Markowetz, A., K. Błaszkiewicz, C. Montag, C. Switala, and T. E. Schlaepfer. 2014. Psycho-informatics: Big data shaping modern psychometrics. Medical Hypotheses 82.4: 405–411.

    Describes how mobile devices and other sources of large-scale behavioral trace data may transform psychology and psychiatry.

  • Tonidandel, S., E. B. King, and J. M. Cortina, eds. 2015. Big data at work: The data science revolution and organizational psychology. New York: Routledge.

    Edited volume with a wide range of topics at the intersection of organizational psychology and data science.

