# Bayesian Statistics

by Alexander LoPilato and Mo Wang

Last reviewed: 14 December 2022. Last modified: 27 July 2016. DOI: 10.1093/obo/9780199846740-0102

## Introduction

Bayesian statistics refers to a general approach to estimating statistical models. In contrast to classical or frequentist statistics, Bayesian statistics, also referred to as Bayesian methods, treats a population parameter as a random variable rather than a fixed value. Bayesian methods combine the information obtained from the observed data and the specified statistical model—in the form of the likelihood function—with the researcher’s prior beliefs about the effects under investigation—in the form of the prior distribution—to estimate a posterior distribution for each effect. The posterior distribution is a probability distribution that describes the uncertainty surrounding an effect. Bayesian methods are named after the Reverend Thomas Bayes, who derived Bayes’ theorem. Using conditional probabilities, this theorem equates a model’s posterior distribution, a probability distribution for the model parameters conditional on the observed data, to the product of its likelihood function, which is informed only by the observed data, and the prior distribution, which is informed by the researcher’s prior beliefs about plausible parameter values. The product is then rescaled, or normalized, by dividing it by the marginal probability of the observed data. Given this formulation, the posterior distribution can be changed either by obtaining more data or by changing the prior distribution to reflect a different degree of certainty about the effect.
However, as more data are collected and sample sizes increase, the “likelihood swamps the prior,” and the results obtained from Bayesian methods converge to those obtained from frequentist methods. Even then, Bayesian methods retain an interpretive advantage: one can still speak of the most probable parameter values given the data.
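In symbols (standard notation, not a formula reproduced from any one source cited below), Bayes’ theorem for a parameter θ and observed data y is:

```latex
% Bayes' theorem: posterior = likelihood x prior / marginal likelihood
p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}
                 \propto p(y \mid \theta)\, p(\theta)
```

Because the marginal probability p(y) does not depend on θ, the posterior is proportional to the likelihood times the prior, which is the form most estimation methods exploit.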

## Overviews of Research Methods

As with frequentist methods, no single text on Bayesian methods can cover all of the relevant statistical theory or the variety of modeling applications. However, many books offer a good overview of Bayesian methods, though they vary in accessibility. Perhaps the best-known and most widely used Bayesian text is Bayesian Data Analysis (Gelman, et al. 2013). This book offers a technical introduction to Bayesian methods that moves from the fundamentals of Bayesian inference all the way to fitting nonlinear and nonparametric Bayesian models. Cowles 2013, Gill 2014, and Song and Lee 2012 all offer strong, technical introductions to Bayesian methods. Cowles 2013 places more emphasis on modeling single- and multiparameter distributions. Gill 2014 is written for social scientists and as such focuses on Bayesian hypothesis testing, estimating general linear models, and generalized hierarchical linear models. Song and Lee 2012 focuses on estimating basic and advanced structural equation models with Bayesian methods. Gelman and Hill 2007, Jackman 2009, Kaplan 2014, Kéry 2010, and Kruschke 2015 all offer introductions that are more accessible to non-quantitative social scientists. Gelman and Hill 2007 focuses on estimating Bayesian multilevel (hierarchical) linear models. Jackman 2009 offers a balanced view of statistical models commonly used by social scientists, such as general linear models, multilevel models, and latent variable models. Like Jackman 2009, Kaplan 2014 focuses on statistical models commonly used in the social sciences but places more emphasis on the practical application of Bayesian models than on the theory behind them. Written for ecologists, Kéry 2010 offers an accessible introduction to Bayesian general and generalized linear models as well as their multilevel counterparts. Kruschke 2015 provides perhaps the most accessible and thorough introduction to Bayesian methods.
Kruschke 2015 covers much of the same material as Kéry 2010 but in more mathematical detail. The BUGS Book (Lunn, et al. 2013) can function as an introduction to Bayesian methods, but it is perhaps best used as an introduction to the Bayesian estimation program, WinBUGS.

• Cowles, M. K. Applied Bayesian Statistics with R and OpenBUGS Examples. New York: Springer, 2013.

Provides a concise introduction to Bayesian statistics. It focuses on building the mathematical and computational foundations needed to estimate single and multiparameter probability models.

• Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian Data Analysis. 3d ed. New York: Chapman and Hall, 2013.

This is the authoritative text on Bayesian statistics. This book is broken into five different parts that cover the following: (1) the fundamentals of Bayesian inference, (2) the fundamentals of Bayesian data analysis, (3) advanced computation for Bayesian data analysis, (4) Bayesian regression models, and (5) Bayesian nonlinear and nonparametric models.

• Gelman, A., and J. Hill. Data Analysis Using Regression and Multilevel/Hierarchical Models. New York: Cambridge University Press, 2007.

Although the main purpose of this book is to introduce general and generalized linear models, as well as general and generalized linear mixed-effects models, it provides an introduction to how Bayesian methods can be used to estimate those models.

• Gill, J. Bayesian Methods: A Social and Behavioral Sciences Approach. 3d ed. Boca Raton, FL: Taylor and Francis, 2014.

Provides a technical introduction to Bayesian methods. It focuses heavily on the statistical foundations of Bayesian methods with several chapters on Markov Chain Monte Carlo theory/method.

• Jackman, S. Bayesian Analysis for the Social Sciences. Chichester, UK: Wiley, 2009.

This book provides a nice balance between Bayesian statistical theory and application. It devotes an entire chapter to Bayesian measurement theory and covers Bayesian factor analysis models and item response models.

• Kaplan, D. Bayesian Statistics for the Social Sciences. New York: Guilford, 2014.

Provides a comprehensive and accessible introduction to Bayesian methods. Its strength, however, is its coverage of Bayesian structural equation models including latent growth models, mixture models, and multilevel latent variable models.

• Kéry, M. Introduction to WinBUGS for Ecologists: A Bayesian Approach to Regression, ANOVA, Mixed Models, and Related Analyses. San Diego, CA: Elsevier, 2010.

Provides a relatively non-technical introduction to Bayesian methods. It briefly touches on the statistical foundations of Bayesian methods but then quickly moves on to its applications. One of the strengths of this book is that each chapter introduces a different statistical model and then compares the frequentist estimated model to the Bayesian estimated model.

• Kruschke, J. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2d ed. San Diego, CA: Elsevier, 2015.

Provides a nice mixture of Bayesian statistical theory and applications. It is a well-balanced, non-technical introduction to Bayesian methods. It is also one of the few texts to provide several chapters on Bayesian hypothesis testing and statistical power.

• Lunn, D., C. Jackson, N. Best, A. Thomas, and D. Spiegelhalter. The BUGS Book: A Practical Introduction to Bayesian Analysis. Boca Raton, FL: Taylor and Francis, 2013.

Although concise, this text provides a comprehensive overview of Bayesian methods. It was written by the researchers who created the widely used Bayesian analysis software: WinBUGS. It contains a useful list of WinBUGS commands and can function as a WinBUGS user guide.

• Song, X. -Y., and S. -Y. Lee. Basic and Advanced Bayesian Structural Equation Modeling with Applications in the Medical and Behavioral Sciences. Chichester, UK: Wiley, 2012.

A comprehensive overview of Bayesian structural equation modeling. Although technical, this book covers linear structural equation models (SEMs), nonlinear SEMs, multilevel SEMs, mixture SEMs, latent curve models, longitudinal SEMs, semi-parametric SEMs, and non-parametric SEMs.

## Journals

Journals offer the most up-to-date and innovative developments in Bayesian methods. Once grounded in Bayesian methods, researchers should turn to leading methodological and empirical journals to deepen their knowledge. The leading methodological journals that publish research on Bayesian methods are Bayesian Analysis, Multivariate Behavioral Research, Psychological Methods, Organizational Research Methods, and Structural Equation Modeling. The leading empirical management journals are the Journal of Applied Psychology and the Journal of Management. Because Bayesian methods are relatively new to the management discipline, only a few empirical journals have published research using them.

• Published quarterly by the International Society for Bayesian Analysis, Bayesian Analysis is a multidisciplinary journal that publishes statistical and computational articles on Bayesian analysis.

• Published bimonthly by the American Psychological Association, the Journal of Applied Psychology publishes theoretical and empirical articles, usually from the fields of management and industrial-organizational psychology.

• Published bimonthly by SAGE, the Journal of Management publishes theoretical, empirical, and methodological articles from the fields of management and industrial-organizational psychology.

• Published bimonthly by Routledge, Multivariate Behavioral Research publishes substantive, theoretical, and methodological articles from the behavioral and social sciences disciplines.

• Published quarterly by SAGE, Organizational Research Methods publishes novel methodological work and methodological tutorials relevant to the organizational and management disciplines.

• Published quarterly by the American Psychological Association, Psychological Methods is a leading methodological journal that publishes innovative articles on social scientific methodology, measurement, and research design.

• Published quarterly by Taylor and Francis, Structural Equation Modeling is a multidisciplinary journal that publishes methodological work broadly focused on structural equation modeling.

## Conceptual Issues

Since the 1920s, frequentist methods have largely been favored over Bayesian methods for philosophical and statistical reasons. The focus of this section is on the philosophical reasons. Scientists, social or otherwise, place great value on the objective nature of their research and, as described in Efron 1986, have gone to great lengths to ensure that the statistical methods they use guard their results against their own subjective biases. However, as Berger and Berry 1988 argues, frequentist statistical methods are not objective; they merely keep their subjectivity hidden. In contrast, Bayesian methods embrace subjectivity by allowing researchers to influence the results of their statistical models through the specification of the prior distribution. For example, Zyphur and Oswald 2015 notes that researchers who are confident in their prior beliefs can build this confidence directly into a Bayesian model by shrinking the variance of the prior distribution, thus increasing its influence over the posterior distribution. Indeed, despite the strong arguments made in Galavotti 2015, Gill 1999, and Stone 2013 that management and the larger social sciences should embrace the idea of subjective probabilities offered by Bayesian methods, subjectivity remains one of the main critiques of Bayesian methods, as noted in Gelman 2008. Gelman and Shalizi 2013, however, argues that researchers can estimate Bayesian models without subscribing to the Bayesian philosophy of subjective probabilities. They propose that researchers treat the prior distribution as another aspect of the statistical model that can be correctly or incorrectly specified. In a similar vein, Kaplan 2014 proposes that researchers build priors by using frequentist methods to analyze older data sets that contain the same variables as the current data set.
The results of that analysis can then inform the specification of the prior distribution. Additionally, Berger 2006 makes a strong case for objective Bayesian methods. Even so, it is important for researchers to remember that Bayesian methods are not a methodological panacea. As noted in Gigerenzer and Marewski 2015, some research scenarios may call for Bayesian methods, whereas others may call for frequentist or nonparametric statistical methods.

• Berger, J. “The Case for Objective Bayesian Analysis.” Bayesian Analysis 1 (2006): 385–402.

An introduction and overview of objective Bayesian analysis.

• Berger, J. O., and D. A. Berry. “Statistical Analysis and the Illusion of Objectivity.” American Scientist 76 (1988): 159–165.

Argues that, in general, the practice of statistics—frequentist and Bayesian—is subjective. The strength of Bayesian statistics is that it acknowledges this subjectivity and opens it up to criticism from the scientific community.

• Efron, B. “Why Isn’t Everyone a Bayesian?” American Statistician 40 (1986): 1–5.

Briefly reviews the two chief competitors to Bayesian statistics—Fisherian theory and the Neyman-Pearson-Wald school of decision theory—and outlines reasons why they are preferred over Bayesian theory.

• Galavotti, M. C. “Probability Theories and Organization Science: The Nature and Usefulness of Different Ways of Treating Uncertainty.” Journal of Management 41 (2015): 744–760.

This paper reviews different probability theories and their interpretations. It finishes with a recommendation that organizational research should take on a subjective interpretation of probability.

• Gelman, A. “Objections to Bayesian Statistics.” Bayesian Analysis 3 (2008): 445–450.

Written by a Bayesian statistician, this article outlines several arguments against Bayesian statistics that researchers should be aware of.

• Gelman, A., and C. R. Shalizi. “Philosophy and Practice of Bayesian Statistics.” British Journal of Mathematical and Statistical Psychology 66 (2013): 8–38.

Argues against conflating Bayesian philosophy and Bayesian methods. Its main focus is on ways researchers can test and subsequently falsify Bayesian statistical models.

• Gigerenzer, G., and J. N. Marewski. “Surrogate Science: The Idol of a Universal Method for Scientific Inference.” Journal of Management 41 (2015): 421–440.

This is a theoretical paper about the dangers of mechanically applying statistical methods—Bayesian or otherwise—to test theories in the social sciences.

• Gill, J. “The Insignificance of Null Hypothesis Significance Testing.” Political Research Quarterly 52 (1999): 647–674.

A critique of null hypothesis significance testing that forwards Bayesian methods as a potential alternative to theory testing.

• Kaplan, D. Bayesian Statistics for the Social Sciences. New York: Guilford, 2014.

The closing chapter of this text highlights theoretical differences between frequentist and Bayesian statistics. It then contrasts subjective Bayesian methods with objective Bayesian methods and ends with a discussion on evidence-based Bayesian methods.

• Stone, J. V. Bayes’ Rule: A Tutorial Introduction to Bayesian Analysis. Sheffield, UK: Sebtel, 2013.

Provides a theoretical introduction to Bayesian statistics with few applied data-analytic examples. The closing chapter focuses on the nature of probability and the distinction between frequentist and Bayesian statistics.

• Zyphur, M. J., and F. L. Oswald. “Bayesian Estimation and Inference: A User’s Guide.” Journal of Management 41 (2015): 387–389.

A theoretical article that provides one of the first reviews of Bayesian methods written for organizational scholars.

## Model Estimation

Until the computational and statistical developments of the late 20th century, many Bayesian models could not be estimated in practice. These estimation difficulties, along with the philosophical debates, slowed the spread of Bayesian methods. Indeed, the choice of a prior was often based on mathematical convenience rather than theoretical relevance. Such priors, called conjugate priors, ensure that the resulting posterior distribution belongs to the same distributional family as the prior. With the increased availability of high-powered computers and the development of Markov Chain Monte Carlo methods, however, researchers can now use numerical methods to estimate Bayesian models without specifying a conjugate prior.
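To make conjugacy concrete, here is a minimal sketch in Python of the textbook Beta-Binomial case (the function name and numbers are illustrative, not drawn from the works cited in this article): with a Beta(a, b) prior on a binomial success probability, observing k successes and n − k failures yields a Beta(a + k, b + n − k) posterior in closed form.

```python
# Beta-Binomial conjugacy: the posterior stays in the Beta family,
# so no numerical sampling is required. Illustrative sketch only.

def beta_binomial_update(a, b, successes, failures):
    """Return the parameters of the conjugate Beta posterior."""
    return a + successes, b + failures

# A uniform Beta(1, 1) prior updated with 7 successes and 3 failures
a_post, b_post = beta_binomial_update(1, 1, 7, 3)   # Beta(8, 4)
posterior_mean = a_post / (a_post + b_post)         # 8 / 12
print(a_post, b_post, round(posterior_mean, 3))
```

With a conjugate prior the update is pure bookkeeping; MCMC methods matter precisely when no such closed form exists.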

### Markov Chain Monte Carlo Methods

Markov Chain Monte Carlo (MCMC) methods are a general class of methods that allow researchers to simulate and summarize probability distributions. Gelman, et al. 2013; Geyer 1992; and Robert and Casella 2004 all provide useful introductions to MCMC methods. Monte Carlo methods—the second part of MCMC—cover a broad class of algorithms designed to numerically sample independent observations from a target probability distribution. After enough observations have been sampled from the distribution, they can be used to estimate various distributional parameters or transformations of those parameters. A detailed and accessible introduction to the role of Monte Carlo methods in Bayesian analysis can be found in Jackman 2004 and Jackman 2000, whereas a general introduction to Monte Carlo methods can be found in Braun and Murdoch 2007. Markov chains—the first part of MCMC—describe a class of stochastic processes that explore a state space, such as a probability distribution, as defined by a transition matrix. Their defining feature is that a Markov chain’s future position depends only on its current position. Combined, MCMC methods allow a researcher to estimate and evaluate complex multidimensional probability spaces, such as the posterior distribution, that would otherwise be impossible to calculate. For a technical discussion of Markov chains the reader is directed to Gill 2014, and for a non-technical discussion to Gill and Witko 2013.
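The Monte Carlo half of the idea can be illustrated in a few lines of Python (a generic sketch, not tied to any cited source): draw independent samples from a distribution and average a function of them to approximate an expectation.

```python
import random

# Plain Monte Carlo: estimate E[X^2] for X ~ Uniform(0, 1) by averaging
# over random draws. The exact answer is the integral of x^2 on [0, 1],
# i.e. 1/3, so the estimate should land close to that value.

random.seed(42)  # fixed seed for reproducibility
n_draws = 100_000
estimate = sum(random.random() ** 2 for _ in range(n_draws)) / n_draws
print(round(estimate, 3))  # close to 1/3
```

The same averaging logic applies to draws produced by a Markov chain; the chain's job is simply to supply (dependent) samples from the posterior when independent sampling is impossible.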

• Braun, W. J., and D. J. Murdoch. A First Course in Statistical Programming with R. New York: Cambridge University Press, 2007.

Offers an accessible introduction to the requisite statistical programming knowledge needed to understand how computational MCMC algorithms function.

• Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian Data Analysis. 3d ed. New York: Chapman and Hall, 2013.

Over several chapters, this book provides an introduction to theoretical and computational MCMC methods. It relies on mathematical formulae to convey the basics of MCMC, but that rigor is necessary for more advanced applications of Bayesian methods.

• Geyer, C. J. “Practical Markov Chain Monte Carlo.” Statistical Science 7 (1992): 473–483.

Provides an overview of MCMC. Although technical, this paper succinctly summarizes estimation issues surrounding MCMC methods such as determining when the chains have reached a stationary distribution.

• Gill, J. Bayesian Methods: A Social and Behavioral Sciences Approach. 3d ed. Boca Raton, FL: Taylor and Francis, 2014.

Through several chapters this book covers both basic and advanced topics on the theory of MCMC. It also provides a chapter that covers practical MCMC considerations such as making decisions about thinning a Markov chain, an appropriate burn-in period, and assessing Markov chain convergence.

• Gill, J., and C. Witko. “Bayesian Analytical Methods: A Methodological Prescription for Public Administration.” Journal of Public Administration Research and Theory 23 (2013): 457–494.

Provides a general introduction to Bayesian methods for social scientists and includes a supplemental appendix on MCMC estimation.

• Jackman, S. “Estimation and Inference via Bayesian Simulation: An Introduction to Markov Chain Monte Carlo.” American Journal of Political Science 44 (2000): 375–404.

Written for social scientists, this article provides an accessible introduction to MCMC methods and includes several modeling examples.

• Jackman, S. “Bayesian Analysis for Political Research.” Annual Review of Political Science 7 (2004): 483–505.

Provides a general overview of Bayesian methods and uses one of its sections to provide a concise, useful summary of MCMC methods.

• Metropolis, N., and S. Ulam. “The Monte Carlo Method.” Journal of the American Statistical Association 44 (1949): 335–341.

A seminal and classic article that introduced the Monte Carlo method.

• Robert, C. P., and G. Casella. Monte Carlo Statistical Methods. 2d ed. New York: Springer, 2004.

Provides an in-depth, technical introduction to MCMC methods that covers both basic and advanced topics.

### Sampling Algorithms

The crux of a Markov chain is its transition matrix. The transition matrix defines the possible states the chain can explore as well as the probability of transitioning to a given state from the current state. In Bayesian statistics, Markov chains are used to explore the posterior distribution, which is often described by many different parameters, such as two or more regression coefficients. To ensure that the Markov chain finds and stays in the posterior distribution, the transition matrix needs to be specified using a sampling algorithm. The Gibbs sampler and the Metropolis-Hastings algorithm are two of the more common sampling algorithms used in Bayesian statistics. Hoffman and Gelman 2014, however, introduced a newer sampler, the No-U-Turn sampler, which is gaining in popularity. Stated simply, the Gibbs sampler breaks a joint probability distribution into its simpler conditional distributions and then samples from those. That is, the Gibbs sampler uses the conditional distributions as the transition matrix. In contrast, the Metropolis-Hastings algorithm uses an acceptance-rejection sampling scheme as the transition matrix, which allows it to sample directly from the posterior distribution. Geman and Geman 1984 is the first published work on the Gibbs sampler and is technical in nature. Casella and George 1992 offers a more accessible introduction to the Gibbs sampler, and Chib and Greenberg 1995 offers a similarly accessible introduction to the Metropolis-Hastings algorithm.
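The Metropolis-Hastings accept/reject step can be sketched in a few lines of Python. This is a generic random-walk Metropolis sampler for a standard normal target; the target, proposal scale, chain length, and burn-in are all illustrative assumptions, not details taken from the cited articles.

```python
import math
import random

# Random-walk Metropolis sketch: each step proposes a move and accepts it
# with probability min(1, target(proposal) / target(current)).

def log_target(x):
    return -0.5 * x * x  # log density of N(0, 1), up to a constant

random.seed(0)
x = 0.0
samples = []
for _ in range(50_000):
    proposal = x + random.gauss(0.0, 1.0)  # symmetric Gaussian proposal
    # With a symmetric proposal the Hastings correction cancels, so the
    # acceptance test reduces to comparing target densities.
    if math.log(random.random()) < log_target(proposal) - log_target(x):
        x = proposal
    samples.append(x)

burned = samples[10_000:]  # discard burn-in
mean = sum(burned) / len(burned)
var = sum((s - mean) ** 2 for s in burned) / len(burned)
print(round(mean, 1), round(var, 1))  # should be near 0 and 1
```

A Gibbs sampler differs only in the proposal mechanism: it cycles through the parameters, drawing each from its full conditional distribution, and every draw is accepted.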

• Casella, G., and E. I. George. “Explaining the Gibbs Sampler.” American Statistician 46 (1992): 167–174.

Provides an accessible introduction to the Gibbs sampler, which is one of the more common sampling algorithms used to estimate Bayesian statistical models.

• Chib, S., and E. Greenberg. “Understanding the Metropolis-Hastings Algorithm.” American Statistician 49 (1995): 327–335.

Provides an introduction to the Metropolis-Hastings algorithm. The Metropolis-Hastings algorithm is another common sampling algorithm used to estimate Bayesian models and is more complex in comparison to the Gibbs sampler.

• Geman, S., and D. Geman. “Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images.” IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (1984): 721–741.

This seminal article introduced the Gibbs sampler. Although this article did not use the Gibbs sampler to estimate a Bayesian model, it served as a catalyst for Bayesian model estimation.

• Hoffman, M. D., and A. Gelman. “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo.” Journal of Machine Learning Research 15 (2014): 1351–1381.

Introduces the No-U-Turn sampler. The No-U-Turn sampler is a relatively new sampling algorithm designed to more efficiently sample correlated model parameters.

## Elements of the Posterior Distribution

Once a Bayesian model has been estimated, instead of a single point estimate for each model parameter, a researcher has a posterior distribution for each parameter that reflects both the likelihood function of the observed data and the researcher’s prior beliefs about that parameter. Because of this, it is important to understand the likelihood function, the available prior distributions, and how to summarize and report Bayesian model results. There are a variety of statistical texts on likelihood theory, but Myung 2003 provides a short, accessible overview. Just as in frequentist statistics, the likelihood function is a function of the parameters of a specified probability density (or mass) function, used to calculate the likelihood of obtaining the observed data given the specified parameters. That is, the likelihood function fixes the data and tracks how the likelihood of those data changes as the parameters vary. The parameter values that produce the largest likelihood are the maximum likelihood estimates. In contrast, the prior distribution need not rely on the observed data. The prior distribution is a probability distribution that reflects the researcher’s beliefs about the values a model parameter can take. As such, researchers can specify priors that rule out certain parameter values by setting the probability of observing those values to zero. Informative priors—priors that reflect a high degree of certainty about the value of a parameter—are often derived from previous findings or elicited from subject matter experts, as discussed in Gill and Walker 2005. Alternatively, when researchers do not want the posterior distribution to be overly influenced by the prior distribution, they may specify an uninformative prior that reflects a high degree of uncertainty.
However, Gelman 2006 argues that for bounded parameters, such as variance parameters, researchers need to be careful in their choice of probability distribution when formulating uninformative priors. Finally, once a parameter’s posterior distribution has been estimated, it can be described by measures of central tendency (e.g., mean, median, and/or mode) as well as measures of dispersion, as documented in Jackman 2000 and Kruschke, et al. 2012. As an additional benefit, a posterior distribution can be simultaneously estimated for any function of the model parameters, as seen in Jackman 2000. For example, because the indirect effect in a statistical mediation model is a function of two (or more) regression coefficients, a posterior distribution can be estimated for the indirect effect itself.
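Summarizing a posterior from simulation draws is straightforward. The sketch below uses simulated normal draws as a stand-in for MCMC output (all numbers are illustrative, not results from any cited study) and reports a mean, a median, and a central 95% credible interval.

```python
import random
import statistics

# Pretend posterior draws for a parameter centered near 2 with sd 0.5;
# in practice these would come from an MCMC sampler.
random.seed(1)
draws = [random.gauss(2.0, 0.5) for _ in range(20_000)]

mean = statistics.fmean(draws)
median = statistics.median(draws)

# quantiles(n=40) returns the 2.5%, 5%, ..., 97.5% cut points, so the
# first and last bound a central 95% credible interval.
cuts = statistics.quantiles(draws, n=40)
lo, hi = cuts[0], cuts[-1]

print(round(mean, 2), round(median, 2), (round(lo, 2), round(hi, 2)))
```

The same summaries apply to any function of the draws: squaring each draw, or multiplying draws of two coefficients, yields a posterior sample for the transformed quantity (e.g., an indirect effect) at no extra cost.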

• Gelman, A. “Prior Distributions for Variance Parameters in Hierarchical Models.” Bayesian Analysis 3 (2006): 515–533.

Reviewed several plausible prior distributions for variance components and introduced a new half-Cauchy prior parameterization.

• Gill, J., and L. D. Walker. “Elicited Priors for Bayesian Model Specifications in Political Science Research.” Journal of Politics 67 (2005): 841–872.

This article introduces different methods for converting qualitative knowledge gained from a subject matter expert into various prior distributions.

• Jackman, S. “Estimation and Inference are Missing Data Problems: Unifying Social Science Statistics via Bayesian Simulation.” Political Analysis 8 (2000): 307–332.

Specific sections of this article use real data examples to illustrate how researchers can summarize the Bayesian posterior distribution. It also provides a section comparing point estimates obtained from frequentist statistical models to posterior distribution estimates.

• Kruschke, J. K., H. Aguinis, and H. Joo. “The Time Has Come: Bayesian Methods for Data Analysis in the Organizational Sciences.” Organizational Research Methods 15 (2012): 722–752.

Written as a general introduction to Bayesian methods for organizational psychologists, this article provides several accessible sections on specifying prior distributions and summarizing posterior distributions.

• Myung, I. J. “Tutorial on Maximum Likelihood Estimation.” Journal of Mathematical Psychology 47 (2003): 90–100.

Provides an accessible introduction to the likelihood function. It was written as a tutorial for psychologists.

## Bayesian Statistical Software

Despite these computational and statistical advances, relatively user-friendly Bayesian statistical programs did not become freely available until the late 1990s, with the release of WinBUGS. Management researchers can now choose among several Bayesian statistical programs, including OpenBUGS, JAGS, and Stan. As with any statistical program, however, researchers must learn the BUGS (OpenBUGS and JAGS) and/or Stan programming languages. Jackman 2009, Kaplan 2014, and Kruschke 2015 provide in-text JAGS code to accompany their Bayesian modeling examples. Kéry 2010 and Lunn, et al. 2013 provide in-text BUGS code (compatible with OpenBUGS and its predecessor, WinBUGS) to accompany theirs. Kruschke 2015 also provides in-text Stan code for some of its examples.

• Jackman, S. Bayesian Analysis for the Social Sciences. Chichester, UK: Wiley, 2009.

Provides both in-text R and JAGS code for selected examples. The examples cover hierarchical linear models, generalized linear models, and measurement models. The book’s publisher also provides downloadable code and data for all of the in-text examples.

• Kaplan, D. Bayesian Statistics for the Social Sciences. New York: Guilford, 2014.

Provides both in-text R and JAGS code for selected examples. The examples cover general and generalized linear models, hierarchical linear models, structural equation models with latent variables, and mixture models. The book publisher also provides downloadable code and data for all of the in-text examples.

• Kéry, M. Introduction to WinBUGS for Ecologists: A Bayesian Approach to Regression, ANOVA, Mixed Models, and Related Analyses. San Diego, CA: Elsevier, 2010.

Although written for ecologists, this book provides accessible in-text examples with WinBUGS code. It also provides in-text R code to generate the data and subsequently fit a frequentist model to compare to the Bayesian model. The examples cover t-tests, general and generalized linear models, and general and generalized linear-mixed effects models.

• Kruschke, J. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2d ed. San Diego, CA: Elsevier, 2015.

Written for psychologists, this book has relatable and accessible examples with in-text code. This is one of the few books that provides R, JAGS, and Stan code, and it also provides general and generalized linear model examples. All of the data and code in the book can be downloaded through Kruschke’s website.

• Lunn, D., C. Jackson, N. Best, A. Thomas, and D. Spiegelhalter. The BUGS Book: A Practical Introduction to Bayesian Analysis. Boca Raton, FL: Taylor and Francis, 2013.

Written by the creators of WinBUGS, this book provides applied statistical examples with in-text WinBUGS code. The examples span general and generalized linear models, hierarchical models, and more specialized models (e.g., spatial models, time series models, and spline models). This book also provides several useful appendices that detail the BUGS syntax, functions, and probability distributions.

## Methods

One of the advantages of Bayesian methods over frequentist methods is that any statistical model that can be estimated with frequentist methods can also be estimated with Bayesian methods, whereas the reverse is not true: some models can only be estimated with Bayesian methods. In this section, we focus on several statistical models commonly used by management researchers: mean comparisons, general and generalized linear models, general and generalized linear mixed-effects models, and general and generalized latent variable models. For each, we provide citations that demonstrate its Bayesian counterpart.

### Bayesian Mean Comparisons

Mean comparisons can be made between two independent or dependent groups in the form of an independent- or dependent-samples t-test, respectively, as discussed in Kruschke 2013. Comparisons can also be extended to three or more groups using an analysis of variance, as discussed in Jackman 2009, or a hierarchical linear model with no predictors, as seen in Gelman, et al. 2012. When estimating a Bayesian mean comparison, it is necessary to specify prior distributions for the group deviations from the grand mean, the within-group variance, and the between-group variance. For a two-group comparison, Kruschke 2013 shows that as the posterior distributions for the separate group means are being estimated, their simulated values can be subtracted to create a posterior distribution for the difference between the two group means. Gelman, et al. 2012 and Jackman 2009 use hierarchical models to estimate Bayesian analysis of variance models.
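
The subtraction trick described above can be sketched in a few lines of plain Python. The posterior summaries below are purely illustrative (they do not come from any cited study), and each group's posterior is assumed to have already been approximated as a normal distribution:

```python
import random
import statistics

random.seed(42)

# Hypothetical posterior summaries for two group means (illustrative values):
# (posterior mean, posterior standard deviation).
post_a = (5.2, 0.30)   # group A
post_b = (4.6, 0.35)   # group B

n_draws = 10_000
draws_a = [random.gauss(*post_a) for _ in range(n_draws)]
draws_b = [random.gauss(*post_b) for _ in range(n_draws)]

# Subtracting paired simulated values yields a posterior distribution
# for the difference between the two group means.
diff = [a - b for a, b in zip(draws_a, draws_b)]

diff_mean = statistics.mean(diff)
prob_positive = sum(d > 0 for d in diff) / n_draws
print(f"posterior mean difference: {diff_mean:.2f}")
print(f"P(mean_A > mean_B | data): {prob_positive:.3f}")
```

Because the output is a full probability distribution, summaries such as P(mean_A > mean_B | data) come for free, with no additional test statistic required.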

• Gelman, A., J. Hill, and M. Yajima. “Why We (Usually) Don’t Have to Worry About Multiple Comparisons.” Journal of Research on Educational Effectiveness 5 (2012): 189–211.

This article argues that, when possible, empirical Bayes and fully Bayesian hierarchical models should be used to make multiple comparisons. Through partial pooling, both empirical Bayes and fully Bayesian hierarchical models pull group estimates toward their grand mean, thus providing a multiple comparison correction without sacrificing statistical power.

• Jackman, S. Bayesian Analysis for the Social Sciences. Chichester, UK: Wiley, 2009.

Provides a detailed section on how to estimate a one-way analysis of variance as a Bayesian hierarchical model.

• Kruschke, J. K. “Bayesian Estimation Supersedes the T Test.” Journal of Experimental Psychology: General 142 (2013): 573–603.

This is a detailed article that provides a general introduction to Bayesian t-tests and compares their performance to frequentist t-tests.

### Bayesian General and Generalized Linear Models

The general linear model is heavily relied on by management researchers. It is used to estimate the effects that continuous and categorical predictors have on an outcome with a normal error distribution. Generalized linear models extend this framework to outcomes with non-normal error distributions; they encompass logistic regression, Poisson regression, and a variety of other nonlinear regression models. Most textbooks on Bayesian methods contain chapters that demonstrate how to estimate Bayesian general and generalized linear models, as explained in Gelman, et al. 2013; Gill 2014; Kaplan 2014; Kruschke 2015; and Kruschke, et al. 2012. Indeed, because most Bayesian statistical programs require the user to directly specify the likelihood function of the observed data (e.g., normal, binomial, or Poisson), Bayesian modeling typically forces researchers to think about the error distribution of their data. Beyond general introductions, researchers have used Bayesian general linear models to answer questions about mediation effects, as the sampling distribution of the indirect effect is typically non-normal; see Enders, et al. 2014; Koopman, et al. 2015; Park and Kaplan 2015; Wang and Preacher 2015; and Yuan and MacKinnon 2009. Researchers have compared the performance of Bayesian mediation models to frequentist mediation models under conditions of missing data, as seen in Enders, et al. 2014; complex moderated mediation, as documented in Wang and Preacher 2015; and small sample sizes, as reported in Koopman, et al. 2015.
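
The mediation case illustrates why Bayesian estimation is attractive here: the posterior for the indirect effect is obtained by simply multiplying paired draws for the two path coefficients, with no normality assumption about their product. The sketch below uses hypothetical posterior draws (illustrative normal approximations, not results from any cited study):

```python
import random
import statistics

random.seed(1)

# Hypothetical posterior draws for the two mediation paths:
#   a: predictor -> mediator, b: mediator -> outcome.
n_draws = 20_000
a_draws = [random.gauss(0.50, 0.15) for _ in range(n_draws)]
b_draws = [random.gauss(0.40, 0.15) for _ in range(n_draws)]

# The posterior for the indirect effect a*b is the product of paired draws;
# its distribution is non-normal even though a and b are each normal.
ab_draws = sorted(a * b for a, b in zip(a_draws, b_draws))

# A 95% credible interval read directly off the sorted draws.
lower = ab_draws[int(0.025 * n_draws)]
upper = ab_draws[int(0.975 * n_draws)]
print(f"posterior median indirect effect: {statistics.median(ab_draws):.3f}")
print(f"95% credible interval: [{lower:.3f}, {upper:.3f}]")
```

The credible interval here plays the role that a bias-corrected bootstrap interval plays in the frequentist mediation literature cited above.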

• Enders, C. K., A. J. Fairchild, and D. P. MacKinnon. “A Bayesian Approach for Estimating Mediation Effects with Missing Data.” Multivariate Behavioral Research 48 (2014): 340–369.

Compares the performance of Bayesian linear regression mediation models to frequentist linear regression mediation models in the presence of missing data. The researchers found that the Bayesian mediation model performed as well as a maximum likelihood mediation model with the bias-corrected bootstrap and better than normal-theory significance tests.

• Gelman, A., J. B. Carlin, H. S. Stern, D. B. Dunson, A. Vehtari, and D. B. Rubin. Bayesian Data Analysis. 3d ed. New York: Chapman and Hall, 2013.

Through several chapters, this book covers the statistical theory behind Bayesian general and generalized linear models.

• Gill, J. Bayesian Methods: A Social and Behavioral Sciences Approach. 3d ed. Boca Raton, FL: Taylor and Francis, 2014.

Contains a chapter on the statistical theory underlying the Bayesian general linear model.

• Kaplan, D. Bayesian Statistics for the Social Sciences. New York: Guilford, 2014.

Contains a chapter on general and generalized linear models. Although it does touch on theory underlying both model classes, this book focuses more on their application.

• Koopman, J., M. Howe, J. R. Hollenbeck, and H. -P. Sin. “Small Sample Mediation Testing: Misplaced Confidence in Bootstrapped Confidence Intervals.” Journal of Applied Psychology 100 (2015): 194–202.

Compares the Type I error rates and statistical power of several frequentist bootstrap procedures to Bayesian methods when estimating an indirect effect from small samples. The results showed that Bayesian methods performed as well as the permutation method and better than the bootstrap methods.

• Kruschke, J. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. 2d ed. San Diego, CA: Elsevier, 2015.

Contains several chapters that focus on using Bayesian methods to estimate simple and multiple linear regression models, analysis of variance models with multiple factors, logistic regression models, and ordinal regression models.

• Kruschke, J. K., H. Aguinis, and H. Joo. “The Time Has Come: Bayesian Methods for Data Analysis in the Organizational Sciences.” Organizational Research Methods 15 (2012): 722–752.

Provides an example of Bayesian multiple regression and provides guidance on how to report its results.

• Park, S., and D. Kaplan. “Bayesian Causal Mediation Analysis for Group Randomized Designs with Homogenous and Heterogeneous Effects: Simulation and Case Study.” Multivariate Behavioral Research 50 (2015): 316–333.

Applies Bayesian methods to single-level causal mediation analysis and multilevel causal mediation analysis.

• Wang, L., and K. J. Preacher. “Moderated Mediation Analysis Using Bayesian Methods.” Structural Equation Modeling: A Multidisciplinary Journal 22 (2015): 249–263.

Compares Bayesian moderated mediation models to frequentist moderated mediation models. The researchers found that Bayesian moderated mediation models had higher power than frequentist models that used either parametric standard errors or bootstrap percentile confidence intervals to test the estimated effects. Bayesian models and frequentist models that used bias-corrected bootstrap confidence intervals displayed similar power.

• Yuan, Y., and D. P. MacKinnon. “Bayesian Mediation Analysis.” Psychological Methods 14 (2009): 301–322.

Compares Bayesian estimated simple mediation models to frequentist estimated simple mediation models. It also proposes a Bayesian multilevel mediation model.

### Bayesian General and Generalized Linear Mixed-Effects Models

General and generalized linear mixed-effects models are a class of statistical models used to estimate the effects of continuous and categorical predictors on an outcome variable with a complex error structure. Complex error structures can occur when outcomes are nested within a higher-level cluster (e.g., Park, et al. 2004 analyzes data in which individuals are nested within American states), crossed with another cluster, such as a set of employees being rated by a set of raters (e.g., the random-effects or variance components model discussed in LoPilato, et al. 2015), or simultaneously crossed and nested in multiple clusters. These structures often create heterogeneous and correlated error variances, which cannot be handled by general or generalized linear models. When these models are estimated using frequentist methods, the researcher is forced to make a complicated distinction between fixed (non-random) and random effects. Because Bayesian models do not force this distinction (every effect is considered a random effect), they are typically preferred over their frequentist counterparts, as documented in Gelman and Hill 2007. Most Bayesian research on linear mixed-effects models has focused on the effects that small sample sizes for both the outcome variable and the clusters (e.g., level-two units in multilevel modeling) have on estimation, as seen in Baldwin and Fellingham 2013; LoPilato, et al. 2015; and Stegmueller 2013. Estimating Bayesian general and generalized linear mixed-effects models can be complicated; Kéry 2010, however, provides an accessible introduction to their estimation.
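
The partial pooling that such hierarchical models perform can be made concrete with the standard normal-normal result: each group's posterior mean is a precision-weighted compromise between that group's sample mean and the grand mean. The sketch below uses illustrative numbers and treats the within- and between-group variances as known, which a full Bayesian analysis would instead estimate:

```python
# Partial pooling in a normal-normal hierarchical model (illustrative values).
sigma2 = 4.0      # within-group variance (held fixed for this sketch)
tau2 = 1.0        # between-group variance (held fixed for this sketch)
grand_mean = 50.0

groups = [
    # (group sample mean, group sample size)
    (55.0, 3),    # small group: shrunk strongly toward the grand mean
    (55.0, 100),  # large group: barely shrunk at all
]

def pooled_mean(ybar, n, mu, sigma2, tau2):
    """Precision-weighted average of the group mean and the grand mean."""
    w_data = n / sigma2       # precision contributed by the group's own data
    w_prior = 1.0 / tau2      # precision contributed by the population prior
    return (w_data * ybar + w_prior * mu) / (w_data + w_prior)

for ybar, n in groups:
    est = pooled_mean(ybar, n, grand_mean, sigma2, tau2)
    print(f"group mean {ybar}, n={n:3d} -> partially pooled estimate {est:.2f}")
```

This shrinkage of noisy small-group estimates toward the grand mean is the mechanism behind the multiple-comparison argument in Gelman, et al. 2012 (cited under Bayesian Mean Comparisons).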

• Baldwin, S. A., and G. W. Fellingham. “Bayesian Methods for the Analysis of Small Sample Multilevel Data with a Complex Variance Structure.” Psychological Methods 18 (2013): 151–164.

In this article, the researchers compare the performance of Bayesian multilevel models to frequentist multilevel models at small sample sizes. They found that Bayesian estimated variance components are more biased but more efficient than frequentist estimated variance components.

• Gelman, A., and J. Hill. Data Analysis Using Regression and Multilevel/Hierarchical Models. New York: Cambridge University Press, 2007.

Written for social scientists, the latter part of this book is devoted to estimating multilevel models with frequentist and Bayesian methods. It includes a chapter on Bayesian multilevel general and generalized linear models.

• Jackman, S. Bayesian Analysis for the Social Sciences. Chichester, UK: Wiley, 2009.

Contains a chapter that covers the statistical theory underlying Bayesian linear and logistic multilevel models.

• Kéry, M. Introduction to WinBUGS for Ecologists: A Bayesian Approach to Regression, ANOVA, Mixed Models, and Related Analyses. San Diego, CA: Elsevier, 2010.

This book provides separate, accessible chapters on the linear mixed-effects model, the Poisson mixed-effects model, and the binomial mixed-effects model. Each chapter demonstrates how to estimate the mixed-effects model using both Bayesian and frequentist methods.

• LoPilato, A. C., N. T. Carter, and M. Wang. “Updating Generalizability Theory in Management Research: Bayesian Estimation of Variance Components.” Journal of Management 41 (2015): 692–717.

In this article, the researchers build on generalizability theory by showing how variance components can be estimated by Bayesian methods. They compared the performance of Bayesian variance component models to frequentist variance components models. They found that the Bayesian variance component model provided the least biased variance estimates when an informative prior was used.

• Park, D. J., A. Gelman, and J. Bafumi. “Bayesian Multilevel Estimation with Poststratification: State-level Estimates from National Polls.” Political Analysis 12 (2004): 375–385.

This article uses a Bayesian logistic hierarchical model to predict voting behavior.

• Stegmueller, D. “How Many Countries for Multilevel Modeling? A Comparison of Frequentist and Bayesian Approaches.” American Journal of Political Science 57 (2013): 748–761.

Examines the effect that varying the level-two sample sizes has on Bayesian and frequentist estimated linear and probit multilevel models. The results show that the Bayesian models outperform the frequentist models when the level-two sample size is small.

### Bayesian General and Generalized Latent Variable Models

General and generalized latent variable models have been widely adopted in management research. Similar to the distinction between general and generalized linear models, general latent variable models are estimated when both the manifest variables (e.g., survey responses) and the latent (unobserved) variables have normal error distributions. If either of these error distributions cannot be considered normal, then a generalized latent variable model, such as an item response model, must be estimated. Bayesian methods are well suited to estimating general and generalized latent variable models, as the latent variables can be treated as missing data to be sampled alongside the model parameters during MCMC estimation, as explained in Jackman 2009, Song and Lee 2012a, and Song and Lee 2012b. Indeed, it is relatively straightforward to estimate Bayesian structural equation models, as discussed in Muthén and Asparouhov 2012 and Stromeyer, et al. 2015, and latent growth models, as seen in Kaplan 2014 and Kaplan and Depaoli 2012. Moreover, because Bayesian methods can handle non-normal manifest and latent error distributions, it is possible to estimate Bayesian item response models, as shown in Curtis 2010 and Fox 2010; mixture models, as documented in Depaoli 2013; factor models with different manifest item error distributions, as explained in Quinn 2004; and latent variable interactions, as discussed in Harring, et al. 2012. However, the onus is on the modeler to choose adequate prior distributions for the latent variable covariance matrix, as discussed in Song and Lee 2012b, and for the factor loadings, as seen in Muthén and Asparouhov 2012 and Stromeyer, et al. 2015.
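
The "latent variables as missing data" idea can be made concrete with a toy Gibbs sampler for a one-factor model, written here in plain Python rather than BUGS or Stan. Everything below (the simulated data, the conjugate updates, the residual variance held fixed at its true value, the N(0, 10) prior on the loadings) is a simplified illustration, not code from any of the cited texts:

```python
import math
import random

random.seed(7)

# Simulate data from a one-factor model: y[i][j] = lam[j] * eta[i] + noise,
# with latent scores eta_i ~ N(0, 1). Illustrative true values.
n, true_lam, sigma = 200, [0.8, 0.6, 1.0], 0.5
true_eta = [random.gauss(0, 1) for _ in range(n)]
y = [[l * true_eta[i] + random.gauss(0, sigma) for l in true_lam]
     for i in range(n)]

# Gibbs sampler treating the latent scores as missing data. For brevity the
# residual variance is held fixed at its true value rather than sampled.
J, sigma2 = len(true_lam), sigma ** 2
lam = [1.0] * J          # starting values for the factor loadings
eta = [0.0] * n          # starting values for the latent scores
draws = []

for it in range(400):
    # Impute each latent score given the current loadings (conjugate normal:
    # posterior precision = prior precision 1 + sum_j lam_j^2 / sigma^2).
    prec_eta = 1.0 + sum(l * l for l in lam) / sigma2
    for i in range(n):
        m = sum(lam[j] * y[i][j] for j in range(J)) / sigma2 / prec_eta
        eta[i] = random.gauss(m, math.sqrt(1.0 / prec_eta))
    # Sample each loading given the imputed scores, under a N(0, 10) prior.
    prec_lam = 1.0 / 10.0 + sum(e * e for e in eta) / sigma2
    for j in range(J):
        m = sum(eta[i] * y[i][j] for i in range(n)) / sigma2 / prec_lam
        lam[j] = random.gauss(m, math.sqrt(1.0 / prec_lam))
    if it >= 200:        # discard the first half as burn-in
        draws.append(list(lam))

post_mean = [sum(d[j] for d in draws) / len(draws) for j in range(J)]
print("posterior mean loadings:", [round(v, 2) for v in post_mean])
```

Because the latent scores are simply drawn at each iteration like any other unknown, the sampler needs no separate integration step for them, which is the point the cited texts make about MCMC estimation of latent variable models.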

• Curtis, S. M. “BUGS Code for Item Response Theory.” Journal of Statistical Software 36 (2010): 1–34.

This article demonstrates how to estimate the two-parameter logistic model, three-parameter logistic model, graded response model, and testlet models using Bayesian methods.

• Depaoli, S. “Mixture Class Recovery in GMM Under Varying Degrees of Class Separation: Frequentist versus Bayesian Estimation.” Psychological Methods 18 (2013): 186–219.

In this article, the researcher compares Bayesian growth mixture models to frequentist growth mixture models. The results show that when using an informative prior, Bayesian growth mixture models can more accurately recover growth trajectories and latent class proportions under varying levels of class separation.

• Fox, J. -P. Bayesian Item Response Models. New York: Springer, 2010.

This advanced text covers both basic and advanced Bayesian item response models. It provides a mathematically rigorous introduction to a variety of Bayesian item response models without any in-text code; the code, however, can be obtained from the author’s website.

• Harring, J. R., B. A. Weiss, and J. -C. Hsu. “A Comparison of Methods for Estimating Quadratic Effects in Nonlinear Structural Equation Models.” Psychological Methods 17 (2012): 193–214.

Compares Bayesian estimation of latent interactions to a variety of frequentist estimators. The results show that Bayesian estimates are comparable to frequentist estimates across different simulation factors when an uninformative prior is used.

• Jackman, S. Bayesian Analysis for the Social Sciences. Chichester, UK: Wiley, 2009.

This book provides a chapter on how to estimate factor analytic and item response models using Bayesian methods.

• Kaplan, D. Bayesian Statistics for the Social Sciences. New York: Guilford, 2014.

Provides an accessible chapter on how to estimate Bayesian confirmatory factor analytic models, single- and multilevel structural equation models, and mixture models.

• Kaplan, D., and S. Depaoli. “Bayesian Structural Equation Modeling.” In Handbook of Structural Equation Modeling. Edited by R. Hoyle, 407–437. New York: Guilford, 2012.

This article is a general introduction to Bayesian structural equation models. Through three different examples, it discusses which priors should be used for structural equation models and how different models should be compared.

• Muthén, B., and T. Asparouhov. “Bayesian Structural Equation Modeling: A More Flexible Representation of Substantive Theory.” Psychological Methods 17 (2012): 313–335.

This article can serve as an introduction to estimating structural equation models using Bayesian methods. Its main purpose, however, is to argue in favor of placing informative priors centered at zero on factor loadings instead of constraining the factor loadings to zero.

• Quinn, K. M. “Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses.” Political Analysis 12 (2004): 338–353.

This article demonstrates how to estimate a factor model from a combination of ordinal and continuous items using Bayesian methods.

• Song, X. -Y., and S. -Y. Lee. Basic and Advanced Bayesian Structural Equation Modeling with Applications in the Medical and Behavioral Sciences. Chichester, UK: Wiley, 2012a.

Although technical, this book provides an in-depth introduction to Bayesian structural equation models. It begins with single-level parametric structural equation models and then works through multilevel models, longitudinal models, models with missing data, and non-parametric models. It also provides code for some of the examples at the end of each chapter.

• Song, X. -Y., and S. -Y. Lee. “A Tutorial on the Bayesian Approach for Analyzing Structural Equation Models.” Journal of Mathematical Psychology 56 (2012b): 135–148.

Written as an introduction for psychologists, this article covers the basics of estimating structural equation models using Bayesian methods. Although somewhat technical, this study provides WinBUGS code to go along with its example.

• Stromeyer, W. R., J. W. Miller, R. Sriramachandramurthy, and R. DeMartino. “The Prowess and Pitfalls of Bayesian Structural Equation Modeling: Important Considerations for Management Research.” Journal of Management 41 (2015): 491–520.

This is a critique of the Bayesian structural equation model advocated in Muthén and Asparouhov 2012, cited under Bayesian General and Generalized Latent Variable Models. The researchers outline their concerns with that model and provide their own recommendations for using Bayesian methods to estimate structural equation models.
