Replication Initiatives in Psychology
by
Jennifer Bastart, Richard A. Klein, Hans IJzerman
  • LAST MODIFIED: 24 May 2018
  • DOI: 10.1093/obo/9780199828340-0212

Introduction

Replication is a key component of a robust, cumulative knowledge base. It plays a critical function in assessing the stability of the scientific literature. Replication involves closely repeating the procedure of a study and determining whether the results are similar to the original. For decades, behavioral scientists were reluctant to publish replications, for reasons both epistemic and pragmatic. First, original studies were viewed as conclusive in most cases, and failures to replicate were often attributed to mistakes by the replicating researcher. In addition, failures to replicate may be caused by numerous factors, and this inherent ambiguity made replications less desirable to journals. On the other hand, replication successes were expected and considered to contribute little beyond what was already known. Finally, editorial policies did not encourage the publication of replications, leaving the robustness of scientific findings largely unreported. A series of events ultimately led the research community to reconsider replication and research practices at large: the discovery of several cases of large-scale scientific misconduct (i.e., fraud); the invention and application of new statistical tools to assess strength of evidence; high-profile publications suggesting that some common practices may be less robust than previously assumed; failures to replicate some major findings of the field; and the creation of new online tools aimed at promoting transparency in the field. To deal with what is often regarded as the crisis of confidence, initiatives have been developed to increase the transparency of research practices, including (but not limited to) preregistration of studies; effect size predictions and sample size/power estimation; and, of course, replications. Replication projects themselves evolved in quality: from early replications whose samples were as problematically small as those of the original studies to large-scale “Many Labs” collaborative projects. Ultimately, the development of higher-quality replication projects and open science tools has led (and will continue to lead) to a clearer understanding of human behavior and cognition and has contributed to a clearer distinction between exploratory and confirmatory behavioral science. The current article gives an overview of the history of replications, of the development of tools and guidelines, and of review papers discussing the theoretical implications of replications.

A Historical Perspective on Replication in Psychological Science

Smith 1970 alerted the community to the lack of replication practices in psychology and reminded it of the importance of replication for cumulative science. More than forty years later, Makel, et al. 2012 reiterates this concern, showing that from 1900 to 2012 only 1.07 percent of published psychology research consisted of replications. Scholars may have been reluctant to conduct and publish replications because of the difficulty of interpreting replication failures as well as the community’s preference for clear and easy patterns/conclusions from data. Giner-Sorolla 2012, for example, describes how the preference for positive (i.e., significant) results and well-written narratives may have introduced publication bias and decreased reproducibility. Greenwald 1975 explains how a bias against publishing null results may discourage researchers from pursuing research involving null hypotheses, resulting in an unrepresentative literature (e.g., a greater proportion of papers that erroneously reject the null). The publication of a “replication” of Stapel and Semin 2007 made clear, however, that replications were often completed yet went unpublished (IJzerman, et al. 2015). Nosek, et al. 2012 argues that the struggle between innovation and accumulation in science led scholars to neglect close replication practices. Preferences for clear patterns from data and difficulty interpreting replication data manifested in editorial policies, which, according to Neuliep and Crandall 1990, encourage the publication of new findings over replications.

The Crisis of Confidence

The crisis of confidence gathered steam in 2005, when Ioannidis investigated the consequences of underpowered studies, conflicts of interest, selective publication of results, and scientific misconduct (see Ioannidis 2005). Ioannidis modeled these effects through simulations with various parameter assumptions and concluded that for most research designs in most fields, most of the published literature is false. In psychology, three seminal events led psychologists to reconsider their scientific practices and to refocus their attention on replications. The first was the publication of a scientific article, Bem 2011, in a well-respected journal claiming that people had “precognition,” or the ability to foresee the future before it happens. The second was a demonstration in Simmons, et al. 2011 showing how flexible research practices make it easy for researchers to find “statistical significance” in randomly generated data sets at unexpectedly high rates. The third was the discovery that a prolific Dutch researcher, Diederik Stapel, had fabricated data on a massive scale (Levelt, et al. 2012). One of the first steps taken to address the crisis of confidence was the publication of a special issue of the journal Social Psychology, Nosek and Lakens 2014, which exclusively contained preregistered replications, including the first “Many Labs” replication project (Klein, et al. 2014). Perhaps the most high-profile indicator of the crisis of confidence was Open Science Collaboration 2015, the Reproducibility Project: Psychology. This project included replications of one hundred pseudo-randomly selected findings from high-profile psychology journals, only 36 percent of which replicated successfully. Spellman 2015 offers a review of the circumstances that led to the current crisis of confidence; the author argues that many of the causes may be attributable to a lack of technological resources and that changes in technology may facilitate progress toward more solid research practices (and, ultimately, toward greater reproducibility).

  • Bem, D. J. 2011. Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. Journal of Personality and Social Psychology 100.3: 407–425.

    DOI: 10.1037/a0021524

    This article reports nine studies, comprising over one thousand participants, ostensibly showing consistent evidence that participants can foresee the future (i.e., “precognition” or ESP). The implausibility of such a result combined with the seemingly robust (for the time) evidence challenged the credibility of accepted methodological practices and evidentiary guidelines in psychological research. Available online.

  • Ioannidis, J. P. A. 2005. Why most published research findings are false. PLoS Medicine 2.8: 0696–0701.

    DOI: 10.1371/journal.pmed.0020124

    This paper reviews several factors that may contribute to false positive findings in the published literature. The author conducts simulations modeling the effects of various research methods and publication practices, and concludes that in most scientific fields more than half of published research findings are likely false.
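
To make the logic concrete, the sketch below computes the positive predictive value of a statistically significant finding from statistical power, the significance threshold, and the pre-study odds that the tested relationship is true, following the simple bias-free version of the framework described in this paper; the numerical inputs are illustrative choices, not values taken from the article.

```python
def positive_predictive_value(power, alpha, prior_odds):
    """Probability that a statistically significant finding reflects a true effect.

    power      : 1 - beta, the chance of detecting a true effect
    alpha      : significance threshold (type I error rate)
    prior_odds : R, the pre-study odds that the tested relationship is true
    """
    true_positives = power * prior_odds   # significant results from true effects
    false_positives = alpha               # significant results from null effects
    return true_positives / (true_positives + false_positives)

# Illustrative scenario: an underpowered study (power = .35) testing an a priori
# unlikely hypothesis (pre-study odds of 1:10) at the conventional alpha = .05.
print(positive_predictive_value(power=0.35, alpha=0.05, prior_odds=0.1))  # ~0.41
```

Under these illustrative assumptions, fewer than half of the "significant" findings would reflect true effects, which is the sense in which most published findings could be false.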

  • Klein, R. A., K. A. Ratliff, M. Vianello, et al. 2014. Investigating variation in replicability: A “Many Labs” replication project. Social Psychology 45.3: 142–152.

    DOI: 10.1027/1864-9335/a000178

    This research project attempts to replicate thirteen classic effects in social psychology. Replications were conducted across thirty-six laboratories with a total of 6,344 participants. Replications were clearly successful for ten of the effects. The authors also investigate how variability in the sample or in the setting of the experiment may account for variability in replication results, with analyses suggesting that this sort of heterogeneity did not play a large role in replicability.

  • Levelt, P., E. Noort, and P. Drenth. 2012. Flawed science: The fraudulent research practices of social psychologist Diederik Stapel.

    This report, authored by the committees in charge of investigating possible academic misconduct by Diederik Stapel, reviews their task, the methods they used to investigate, and the results of these investigations. The report concludes that Stapel committed extensive fraud across many articles.

  • Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science 349.6251.

    This collaboration pseudo-randomly selected studies to replicate from several highly regarded journals in psychology. The article presents replication attempts of one hundred of these findings, with results showing that only 36 percent of the replications they conducted had significant results supporting the original papers. Available online with registration.

  • Nosek, B. A., and D. Lakens, eds. 2014. Editorial. In Special issue: Registered reports; A method to increase the credibility of published results. Social Psychology 45.3: 137–141.

    This editorial presents the first special issue devoted solely to Registered Reports replications. Registered Reports is a publication format in which a study is proposed and peer reviewed prior to data collection. If the proposal passes review, the paper is “conditionally accepted” for publication, regardless of results, provided the authors follow through with the proposal.

  • Simmons, J. P., L. D. Nelson, and U. Simonsohn. 2011. False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science 22.11: 1359–1366.

    DOI: 10.1177/0956797611417632

    This article demonstrates how flexibility in data analysis (e.g., selectively including or excluding data points, running multiple comparisons, selectively stopping data collection) increases the rate of false positive results. The authors present simulations where these “questionable research practices” can allow a researcher to find statistically significant (p < .05) results at alarmingly high rates, even given a randomly generated null data set (i.e., potentially over 50 percent of the time).
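
As a rough illustration of the kind of simulation reported in this article (a sketch of the logic, not the authors' code), the snippet below generates purely null data and applies two of the flexible practices described above (testing two correlated dependent variables and adding participants after an initial look at the data), showing that the false-positive rate climbs well above the nominal 5 percent; the sample sizes and correlation are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def draw_null_data(n):
    """Two-group null data with two correlated dependent variables."""
    group = rng.integers(0, 2, n)            # random condition assignment
    dv1 = rng.normal(size=n)                 # outcome 1, no true effect
    dv2 = 0.5 * dv1 + rng.normal(size=n)     # outcome 2, correlated with outcome 1
    return group, dv1, dv2

def any_significant(group, dv1, dv2):
    """Count a 'hit' if either dependent variable yields p < .05."""
    return any(stats.ttest_ind(dv[group == 0], dv[group == 1]).pvalue < 0.05
               for dv in (dv1, dv2))

def flexible_null_experiment(n_initial=20, n_extra=10):
    """One null experiment with two questionable practices: multiple DVs and
    optional stopping (collect more data and re-test if the first look fails)."""
    group, dv1, dv2 = draw_null_data(n_initial)
    if any_significant(group, dv1, dv2):
        return True
    g2, d1, d2 = draw_null_data(n_extra)
    return any_significant(np.concatenate([group, g2]),
                           np.concatenate([dv1, d1]),
                           np.concatenate([dv2, d2]))

false_positive_rate = np.mean([flexible_null_experiment() for _ in range(2000)])
print(false_positive_rate)  # well above the nominal .05 despite purely null data
```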

  • Spellman, B. A. 2015. A short (personal) future history of revolution 2.0. Perspectives on Psychological Science 10:886–899.

    DOI: 10.1177/1745691615609918

    This article reviews the conjunction of events that led to the replication crisis of the early 21st century. The author considers how this crisis can be perceived as an opportunity to rethink scientific practices. Most importantly, the author emphasizes how the “methods revolution” will likely improve research this time, perhaps most prominently because of the technological evolution of the field, which supports the democratization of solid research practices.

Understanding Replications: From Conceptual to Close

For most of psychology’s history, when researchers conducted replications, they focused on what were called “conceptual replications.” The goal of conceptual replications was to show breadth—and thus generalizability—of the effect (i.e., purposefully changing aspects of the procedure or materials to apply the phenomenon to a new context). Shifting the focus to the verifiability of an effect, researchers started to conduct “close” replications. Some argued for a preference for one type over the other. Schmidt 2009 discusses how close (or, in the author’s term, exact) replications serve a different function than conceptual replications. Crandall and Sherman 2016, on the other hand, argues that conceptual replications are superior to close replications because they modify some of the central parameters of the effect and thereby test the generalizability of the proposed mechanism in new contexts. Although conceptual replications have long been preferred in the published literature, Cesario 2014 argues that conceptual replications cannot substitute for exact replications and can even perpetuate false positive findings when research practices are imprecise. Others recognized that the ideal of exact replication—reproducing an original study in all of its features—is unreachable in psychology (e.g., testing the exact same sample at the same moment in time is impossible). Inferences from close replications can then be further bolstered by collaborations between several laboratories (Klein, et al. 2014, cited under the Crisis of Confidence). On a similar note, Gómez, et al. 2010 reviews classification schemes from several disciplines to propose a replication scheme for software engineering. The authors identify five features that may differ from an original study to its replication (site, experimenter, apparatus, operationalizations, and population properties) and four purposes of replications (controlling for sampling error, controlling for statistical artifacts, determining the limits of the operationalizations, and determining the limits of the population properties). In one of the most recent interpretations, Brandt, et al. 2014 suggests that replications fall on a continuum from close to conceptual and provides a “Replication Recipe” to more accurately capture differences between original and replication studies.

  • Brandt, M. J., H. IJzerman, A. Dijksterhuis, et al. 2014. The replication recipe: What makes for a convincing replication? Journal of Experimental Social Psychology 50.1: 217–224.

    DOI: 10.1016/j.jesp.2013.10.005

    This article argues the term “close replication” is better than the term “exact” replication because a replication can be as close as possible to the original study, but can never fully duplicate the original conditions. It further specifies the necessary ingredients for a high-quality replication.

  • Cesario, J. 2014. Priming, replication, and the hardest science. Perspectives on Psychological Science 9.1: 40–48.

    DOI: 10.1177/1745691613513470

    On the basis of controversies regarding “social priming” effects, this author argues in favor of exact replication, and proposes that conceptual replication can provide additional evidence but cannot be a substitute for exact replication in assuring reproducibility. Available online.

  • Crandall, C. S., and J. W. Sherman. 2016. On the scientific superiority of conceptual replications for scientific progress. Journal of Experimental Social Psychology 66:93–99.

    DOI: 10.1016/j.jesp.2015.10.002

    This paper argues in favor of conceptual replications instead of close replications. The authors discuss how conceptual replications test theoretical variables, whereas close replications test operational variables. The authors conclude that science is self-correcting and robust, and that researchers should have the choice of performing conceptual or close replications, depending on their specific circumstances. Available online.

  • Gómez, O. S., N. Juristo, and S. Vegas. 2010. Replications types in experimental disciplines. Empirical Software Engineering and Measurement 38.23: 9772–9782.

    The authors perform a systematic investigation into how various scientific disciplines classify replications. Based on their findings, the authors propose a classification system for replications in software engineering. However, because the authors sampled various domains, these classifications may be broadly useful as a categorization system. Available online.

  • Schmidt, S. 2009. Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology 13.2: 90–100.

    DOI: 10.1037/a0015108

    This article notes that the definition of replication is ambiguous and attempts to clarify close versus conceptual replications. The author emphasizes several aims of replication and proposes a more structured approach to conducting replications, depending on the specific aims of the project. The author also proposes preregistration as a key element in design. Available online.

Interpreting Replication Results

One of the main reasons many researchers held a negative view of replications was the difficulty of interpreting their results. In years past, researchers thought that interpreting replication results was nearly impossible, particularly if results were not significant: a nearly infinite number of factors could explain a null result. The view that replications were therefore not useful began to change in mainstream research after 2012. Some researchers concluded, based on null results from one lab, that the original phenomenon was “elusive” (e.g., Shanks, et al. 2013). Others estimated reproducibility based on single-shot replications. The Reproducibility Project: Psychology (cited under Early Collaborative Replication Attempts), a large-scale replication project, sought to estimate the reproducibility of psychological science. The researchers involved in Open Science Collaboration 2015 conclude that the rate of reproducibility is low, but note that “success” or “failure” in replication is difficult to define, and therefore use multiple markers to converge on an answer. Others have thus suggested that the term reproducibility may not be entirely accurate: beyond biased original results (commonly referred to as “false positives”), many factors may account for replication failures. Among them, Trafimow and Earp 2016 points to the auxiliary assumptions required to conduct the original study; Klein 2014 points to contextual factors or untheorized moderators; and Luttrell, et al. 2017 points to recent advances in theories. IJzerman, et al. 2013 suggests that, ultimately, replication failures can contribute to theory improvement. In short, the field initially focused on why studies failed to replicate, relying on the statistical and methodological parameters of studies: according to Hüffmeier, et al. 2016, replications concentrated on estimating bias in procedures and on investigating the role of moderators and covariates. Altogether, close replications became relevant for providing information on the stability of an effect (thus focusing on methodological and statistical details) and for spurring theory development by pointing to contexts in which effects could not be replicated, whereas conceptual replications became relevant for providing information about the breadth of the theory.

  • Hüffmeier, J., J. Mazei, and T. Schultze. 2016. Reconceptualizing replication as a sequence of different studies: A replication typology. Journal of Experimental Social Psychology 66:81–92.

    DOI: 10.1016/j.jesp.2015.09.009

    According to this typology of replication, failures or partial failures have different implications depending on the type of replication. Exact and close replications provide information regarding the reliability of an effect, whereas constructive and conceptual replications provide information on contextual moderators and on the applicability of the theory in the field. Available online.

  • IJzerman, H., M. J. Brandt, and J. van Wolferen. 2013. Rejoice! In replication. European Journal of Personality 27:128–129.

    This commentary highlights why replications are difficult to conduct, but are critically valuable nevertheless. It also explains how interpreting failures to replicate can be theoretically consequential because failures help to indicate the boundaries of the investigated effect and thus open the door to theoretical improvement. The authors propose data sharing and inclusion of replication in teaching as solutions to these challenges.

  • Klein, S. B. 2014. What can recent replication failures tell us about the theoretical commitments of psychology? Theory & Psychology 24.3: 326–338.

    DOI: 10.1177/0959354314529616

    This article highlights several factors that can impact the conclusion of a replication. In particular, it focuses on sample variability and contextual moderators. Ultimately, it argues that a lack of specificity in psychological theories may account for replication failures. Available online.

  • Luttrell, A., R. E. Petty, and M. Xu. 2017. Replicating and fixing failed replications?: The case of need for cognition and argument quality. Journal of Experimental Social Psychology 69:178–183.

    DOI: 10.1016/j.jesp.2016.09.006

    This article responds to a failed replication in Ebersole, et al. 2016 (cited under Recommendations for Best Replication Practices and Replication Initiatives in Social Psychology). The authors propose that the failed replication was due to a suboptimal procedure. They conduct a new study manipulating the features they determined to be critical and successfully replicate the original study only in the “optimal” version of the procedure. This emphasizes that theory and domain expertise may be critical factors in replication. Available online.

  • Open Science Collaboration. 2015. Estimating the reproducibility of psychological science. Science 349.6251.

    This article describes the replication of one hundred experiments in papers published in 2008 in three high-ranking psychology journals. About one-third to one-half of the original findings were replicated in the Open Science Collaboration’s project, depending on the exact criteria for defining “success” in replication. Available online.

  • Shanks, D. R., B. R. Newell, E. H. Lee, et al. 2013. Priming intelligent behavior: An elusive phenomenon. PLoS ONE 8.4: 1–10.

    DOI: 10.1371/journal.pone.0056515

    This article presents nine replications of studies priming intelligence and investigating subsequent behavioral changes. Although each study had relatively small samples, aggregate results using Bayesian analyses supported the null hypothesis of no effect.

  • Trafimow, D., and B. D. Earp. 2016. Badly specified theories are not responsible for the replication crisis in social psychology: Comment on Klein. Theory & Psychology 26.4: 540–548.

    DOI: 10.1177/0959354316637136

    This article responds to Klein 2014, arguing that bad theory cannot fully account for the lack of replication in psychology because a given phenomenon exists regardless of its theoretical explanation. Instead, it points to a lack of specificity in auxiliary assumptions about the conditions necessary to obtain an effect as responsible for the lack of replication. Available online.

Best Research and Replication Practices

These two subsections present the community’s efforts to improve the reliability of psychological science. The first subsection introduces current recommendations for best research practices in general; the second covers replication practices that build on the same guidelines.

Recommendations for Best Research Practices

In answer to the current crisis of confidence, Open Science Collaboration 2017 provides methodological guidelines and technologies to promote transparency and caution in research practices, data collection, and data analysis, which can at times reduce the need to use replication as a tool to verify the accuracy of research findings. The researchers in Schweinsberg, et al. 2016 initiated a program to run large-scale, independent replications of unpublished studies, and recommendations like these, together with the separation of exploratory and confirmatory practices as suggested in van’t Veer and Giner-Sorolla 2016 and Wagenmakers, et al. 2012, should improve the precision of research and reduce “HARKing” (hypothesizing after the results are known; Kerr 1998). Such improved precision was further bolstered by tools for analyzing the evidential value in papers, such as the “p-curve” introduced in Simonsohn, et al. 2013. A contributing factor to imprecision may have been the adoption of certain technological aids for research (e.g., the “click-and-play” options in SPSS). Newer technological aids have now started to reduce these problems, as they allow the sharing of data and code (e.g., through the Open Science Framework, or OSF), more intensive collaborations (e.g., by integrating the OSF with other services such as GitHub and Dropbox), the preregistration of studies (e.g., through the OSF or through AsPredicted, a simpler preregistration website), and better archiving of the research cycle (through programs such as R Markdown to write manuscripts and R to analyze research data). The Center for Open Science has also promoted such initiatives, for example through the Preregistration Challenge, to further motivate researchers to make use of these tools, while the Society for the Improvement of Psychological Science supports and trains researchers in bettering their research practices. Stangor and Lemay 2016 explains how these gradual steps should ultimately better separate exploratory and confirmatory research practices, increase the reporting of null results, and likely decrease the (unknown) proportion of false positives in the literature.

  • AsPredicted.

    This website offers a simple format to preregister a study.

  • Kerr, N. L. 1998. HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review 2.3: 196–217.

    DOI: 10.1207/s15327957pspr0203_4

    One questionable research practice is to present post-hoc hypotheses as if they were predicted in advance. This article names this practice HARKing (hypothesizing after the results are known), identifies several such practices, and advises against them. Available online.

  • Open Science Collaboration. 2017. Maximizing the reproducibility of your research. In Psychological science under scrutiny: Recent challenges and proposed solutions. Edited by S. O. Lilienfeld and I. D. Waldman, 1–26. Chichester, UK: John Wiley.

    DOI: 10.1002/9781119095910.ch1

    This chapter provides concrete recommendations for increasing reproducibility in the social sciences. These include power analysis, analysis planning, data collection stopping rules, preregistration, open materials, data and analysis scripts, and direct and extended replications. It also provides advice concerning the dissemination of results.
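
As a concrete illustration of the power analysis step recommended in this chapter, the sketch below uses Python's statsmodels package to compute the per-group sample size needed to detect a smallest effect size of interest; the effect size, alpha, and power values are illustrative choices, not values taken from the chapter.

```python
from statsmodels.stats.power import TTestIndPower

# Illustrative planning values (not from the chapter): smallest effect size of
# interest d = 0.3, two-sided alpha = .05, desired power = .90.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.3, alpha=0.05, power=0.90,
                                   alternative="two-sided")
print(round(n_per_group))  # roughly 235 participants per group
```

A calculation of this kind, fixed before data collection and included in a preregistration, is what makes a later replication (or non-replication) interpretable.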

  • Open Science Framework.

    This online resource allows scholars to share files (data, scripts, articles in progress, etc.) in order to improve collaborative research, discoverability of findings, and transparency in the scientific process. It also supports preregistration of studies.

  • Preregistration Challenge.

    This preregistration website is supported by the Center for Open Science, with the goal of incentivizing preregistration. Researchers who follow the procedure to preregister a study and have it accepted at one of the approved journals are rewarded with $1,000.

  • Schweinsberg, M., N. Madan, M. Vianello, et al. 2016. The pipeline project: Pre-Publication Independent Replications of a single laboratory’s research pipeline. Journal of Experimental Social Psychology 66:55–67.

    DOI: 10.1016/j.jesp.2015.10.001

    This article introduces the Pre-Publication Independent Replication approach, in which a lab collaboratively replicates its studies with other labs prior to publishing the effects. The approach is demonstrated by replicating ten unpublished projects on moral judgment that the first author had “in the pipeline.” Of these, six effects replicated robustly.

  • Simonsohn, U., L. D. Nelson, and J. P. Simmons. 2013. P-curve: A key to the file drawer. Journal of Experimental Psychology: General 143.2: 534–547.

    DOI: 10.1037/a0033242

    This article introduces a statistical tool aimed at identifying whether a paper contains evidential value even in the face of publication bias (the selective reporting of significant results). The method is based on the distribution of reported results between p = 0 and p = .05. In short, assuming there is a true underlying effect, one should expect more low p-values (e.g., p = .01) than relatively large ones (e.g., p = .04). Available online.
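
The intuition can be illustrated with a small simulation (a sketch of the underlying logic, not the authors' implementation): among results that reach significance, p-values are roughly uniform between 0 and .05 when there is no true effect, but pile up near zero when a true effect exists. The effect size and sample size below are arbitrary illustrations.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def significant_p_values(effect_size, n_per_group=20, n_sims=20_000):
    """Run many two-group experiments and keep the p-values that reached p < .05."""
    p_values = []
    for _ in range(n_sims):
        a = rng.normal(0.0, 1.0, n_per_group)
        b = rng.normal(effect_size, 1.0, n_per_group)
        p = stats.ttest_ind(a, b).pvalue
        if p < 0.05:
            p_values.append(p)
    return np.array(p_values)

for effect, label in [(0.0, "null effect"), (0.5, "true effect (d = 0.5)")]:
    ps = significant_p_values(effect)
    # Right-skew check: the share of significant p-values below .025 is about one
    # half under the null (flat p-curve) and clearly larger with a true effect.
    print(label, round(float(np.mean(ps < 0.025)), 2))
```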

  • Stangor, C., and E. P. Lemay, eds. 2016. Introduction to the special issue on methodological rigor and replicability. In Special issue: Methodological rigor and replicability. Journal of Experimental Social Psychology 66:1–3.

    DOI: 10.1016/j.jesp.2016.02.006

    This short article introduces a special issue in the Journal of Experimental Social Psychology devoted to best research practices in psychology. It provides a quick overview of the community reaction to the crisis of confidence and of the solutions proposed to address this crisis. Available online.

  • van’t Veer, A. E., and R. Giner-Sorolla. 2016. Pre-registration in social psychology—a discussion and suggested template. Journal of Experimental Social Psychology 67:2–12.

    DOI: 10.1016/j.jesp.2016.03.004

    This article proposes two ways to use preregistration. Reviewed preregistrations (sometimes called “Registered Reports” elsewhere) imply a reviewing process based on theoretical and methodological quality before data collection; articles are thus published, whatever the results. Unreviewed preregistrations imply the standard reviewing process, but provide assurance that the hypotheses, methods, and analyses were specified before data collection. Available online.

  • Wagenmakers, E. J., R. Wetzels, D. Borsboom, H. L. J. van der Maas, and R. A. Kievit. 2012. An agenda for purely confirmatory research. Perspectives on Psychological Science 7.6: 632–638.

    DOI: 10.1177/1745691612463078

    The authors argue that researcher degrees of freedom in data analysis may substantially increase false positive publications. They thus encourage preregistration of the analysis plan before running the study in order to guarantee the confirmatory nature of the analysis.

Recommendations for Best Replication Practices and Replication Initiatives in Social Psychology

Replication research evolved particularly rapidly in the field of social psychology between 2010 and 2017. As summarized in Asendorpf, et al. 2013, initiatives have promoted the use of replication in research and teaching, while the Open Science Framework: Registration Forms incorporated preregistration templates from the Replication Recipe to promote more informative replications. Debates regarding replication have gradually improved the quality of replications; researchers, like those involved in the Open Science Collaboration, advocate for large-scale, collaborative projects to assess replicability. Simons 2014 favors replications distributed across labs and samples over single-laboratory replications because they allow greater precision in results and greater confidence in findings across contexts. Ebersole, et al. 2016 also assesses heterogeneity between labs and across context (in this case, when in the semester a study is administered), further allowing inferences about the degree of contextual variation; some of the researchers’ conclusions were, however, challenged in IJzerman, et al. 2017. To create new habits, the researchers in Grahe, et al. 2017 developed an education project to promote high-quality replications during undergraduate research. Some of the top journals now not only devote special issues to replications, like the one in Social Psychology in 2014, but also maintain dedicated sections for publishing replications, such as the Registered Replication Reports promoted in Simons, et al. 2014. A frequently updated list of journals accepting Registered Reports can be found on the website of the Center for Open Science.

Controversies and Limitations

Although the field has made considerable progress, controversies remain. Schmidt and Oh 2016 disputes the claim that replications are lacking in psychology, because meta-analyses already exist and are based on (conceptual) replications; these researchers also suggest that meta-analysis is a more proper way to achieve cumulative knowledge. However, van Elk, et al. 2015 argues that meta-analysis is no substitute for replication because current methods for correcting for publication bias are insufficient. Coyne 2016 claims that attention should be directed mostly to better research practices rather than to replication. Moreover, Iso-Ahola 2017 challenges the concept of falsifiability in social science and argues that logical and theoretical support can provide evidence for a phenomenon just as empirical support can. Baumeister 2016 also criticizes the emphasis on replication, arguing that it would slow down discovery and creativity. Strack 2017 points out that non-replications have to be integrated into the scientific debate in order to improve the understanding of the boundary conditions of psychological phenomena. Gilbert, et al. 2016 disputes the claim that the reproducibility rate in the Reproducibility Project: Psychology was only 36 percent, instead placing it at 85 percent. Finally, some successful replications end up unpublished, presumably because successful replications are less likely to be published; the successful replication attempt of Lucas and Donnellan 2014 is one example.

Early Collaborative Replication Attempts

In answer to the confidence crisis in psychology, in the early 2010s scholars led large-scale replication attempts in order to establish more accurately the replication rate in psychology. The aim of this section is to present the diversity of these early attempts, as well as the tools proposed by scholars to inventory replication attempts. Most of these replication projects aim to unite scholars around replications of published findings. The Many Labs Project and the Reproducibility Project: Psychology were large-scale replication attempts. The Registered Replication Reports are preregistered, large-scale replication attempts that can be submitted to Perspectives on Psychological Science. Some innovations came later with the Collaborative Research and Educational Project, which engages undergraduate researchers; the Pipeline Project, which concerns the replication of results before the publication of the original study; and the project titled Many Analysts, One Dataset, in which independent teams analyze the same data set. Scholars have also developed online tools to surface and aggregate information from replications (published or unpublished), such as Curate Science and PsychFileDrawer.
