Evolutionary Biology Selective Sweeps
Wolfgang Stephan, Pavlos Pavlidis
  • LAST MODIFIED: 21 June 2024
  • DOI: 10.1093/obo/9780199941728-0153


A large fraction of the genome of all organisms studied to date is subject to mutations that are effectively neutral with respect to their fitness effects, and hence evolves under genetic drift, as described by the neutral theory. In an extended version, this theory also agrees with the observation that the great majority of newly arising mutations that do affect fitness are mildly deleterious, and that the predominant mode of natural selection is purifying, removing these deleterious mutations from populations. Natural populations are rarely at demographic equilibrium and have commonly undergone recent historical changes. The combined effects of population size changes, population structure, and migration shape patterns of within-species genetic variation. Although these effects are genome-wide, they cannot be assumed to affect patterns of variation uniformly across the genome; indeed, they may produce different effects in different genomic regions, mimicking expectations under selection. A combination of genetic drift (as modulated by the demographic history of the population) with both direct and linked purifying selection shapes patterns of genomic variation. Thus, a baseline model taking joint account of all of these effects is essential for genomic analysis. Although beneficial mutations are a comparatively small fraction of all new mutations, some of them may reach fixation or high frequencies and are thus important in evolution. If the fitness effects of these beneficial mutations are sufficiently strong, they may cause selective sweeps (i.e., localized reductions of genetic variation along genomes). Such localized patterns of reduced genetic variation have been convincingly described in a variety of organisms. In some cases, these genetic changes have been meaningfully connected with both phenotype and fitness.
The effects of these comparatively rare, localized positive selection events are best characterized and quantified as additional to the genome-wide processes just described. Without an appropriate baseline model accounting for these processes, which are common to the genome as a whole, adaptive storytelling may result. For more than twenty years, inference methods have been developed to detect selective sweeps and localize the targets of positive selection in the genome. These methods are based on population genetic models that, in the simplest case, describe the effect of an individual beneficial allele (driven from a single copy to fixation) on linked neutral variation. Such a single-sweep model may be extended to recurrent selective sweeps, in which sweeps occur along the genome independently according to a time-homogeneous process. In addition, soft sweeps may occur, which are caused either by positive selection on standing genetic variation after an environmental change or by multiple adaptive mutations.

General Overview

Selective sweeps were first detected in bacteria and called periodic selection in Atwood, et al. 1951. Suppose a new, selectively favored mutation arises on a non-recombining haplotype that carries a given set of neutral variants. If the favored mutation goes to fixation, the neutral variants linked to the selected mutation will also spread to fixation, while the other variants are lost. As a consequence, at the time of fixation of the beneficial allele, genetic variation on the entire haplotype is completely eliminated. In the presence of recombination, however, the size of the region of reduced variation (around the locus under selection) is limited to a relatively small fraction of the genome. This was demonstrated in Maynard Smith and Haigh 1974 for an infinitely large population, based on a mathematical model with one neutral locus and a partially linked selected locus. The authors called this process genetic hitchhiking. This work by Maynard Smith and Haigh was stimulated by the observation in Lewontin 1974 that allozyme variability levels are only weakly related to population size, which contradicts predictions of the neutral theory. Indeed, if hitchhiking events occur frequently, the hitchhiking model suggests that the observed pattern of genetic variability in a species may depend more on selection than on genetic drift, whose strength is determined by the effective population size (Ne). In the 1970s, Maynard Smith and Haigh’s work was therefore received either as original and exciting or with strong rebuttals (Ohta and Kimura 1975). The hitchhiking effect was revisited in the late 1980s to explain patterns of reduced variation in RFLP (restriction fragment length polymorphism) data. These patterns were found in genomic regions of low recombination rates (Aguadé, et al. 1989). Begun and Aquadro 1992 corroborated these results by showing that levels of DNA variation correlate with recombination rates across much of the Drosophila melanogaster genome.
The deterministic hitchhiking model in Maynard Smith and Haigh 1974 was extended in Kaplan, et al. 1989, which analyzed a stochastic version of the process (including genetic drift). Furthermore, Wiehe and Stephan 1993 studied recurrent hitchhiking. Since Charlesworth, et al. 1993 demonstrated that new strongly deleterious mutations entering a population are eliminated by selection together with any linked neutral variants, this so-called background selection model has competed with the hitchhiking model in explaining the observed reduction of variation in the genome. As background selection also involves a form of hitchhiking, the process of hitchhiking driven by positive selection was subsequently called a selective sweep. The last major shift in the study of selective sweeps occurred around 2000, concomitant with the advent of genomics. Selective sweeps have since become a major concept in evolutionary genetics, as the observed patterns of neutral or nearly neutral variation may be used to infer selective events along the genome (for more details, see Stephan 2019).
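
The qualitative argument above (complete loss of variation without recombination, a weaker effect as recombination increases) can be illustrated with a minimal deterministic sketch. It assumes a simplified haploid two-locus model with selection followed by recombination in each generation; it is not the exact formulation of Maynard Smith and Haigh 1974. The function names, the starting frequencies `p0` and `q0`, and the stopping rule (stop once the beneficial allele reaches frequency 1 - p0) are illustrative assumptions.

```python
def hitchhiking_final_freq(s, r, p0=1e-4, q0=0.5, max_generations=100_000):
    """Deterministic haploid two-locus sweep (illustrative sketch).

    s  : selective advantage of the beneficial allele B (fitness 1 + s vs. 1)
    r  : recombination fraction between the selected and the neutral locus
    p0 : initial frequency of B, which arises on a haplotype carrying allele A
    q0 : initial frequency of neutral allele A among chromosomes without B
    Returns the frequency of neutral allele A once B has reached 1 - p0.
    """
    # Haplotype frequencies: AB, aB (with B) and Ab, ab (without B).
    x_AB, x_aB = p0, 0.0
    x_Ab, x_ab = (1 - p0) * q0, (1 - p0) * (1 - q0)

    for _ in range(max_generations):
        if x_AB + x_aB >= 1 - p0:  # sweep treated as complete
            break
        # Selection: haplotypes carrying B have fitness 1 + s.
        w_bar = 1 + s * (x_AB + x_aB)
        x_AB, x_aB = x_AB * (1 + s) / w_bar, x_aB * (1 + s) / w_bar
        x_Ab, x_ab = x_Ab / w_bar, x_ab / w_bar
        # Recombination: linkage disequilibrium d is reduced by a fraction r.
        d = x_AB * x_ab - x_aB * x_Ab
        x_AB -= r * d
        x_ab -= r * d
        x_aB += r * d
        x_Ab += r * d

    return x_AB + x_Ab


def heterozygosity(q):
    """Expected heterozygosity at a biallelic neutral locus with allele frequency q."""
    return 2 * q * (1 - q)


if __name__ == "__main__":
    # Compare the neutral locus after the sweep for increasing recombination.
    for r in (0.0, 1e-3, 1e-2, 0.5):
        q_final = hitchhiking_final_freq(s=0.05, r=r)
        print(f"r = {r:<6} final freq(A) = {q_final:.4f}  "
              f"heterozygosity = {heterozygosity(q_final):.4f}")
```

With r = 0 the neutral allele linked to the beneficial mutation is carried almost to fixation and heterozygosity collapses, whereas with free recombination the neutral allele frequency stays close to its starting value. Because the model is deterministic, it ignores genetic drift, which is the component added by the stochastic treatment in Kaplan, et al. 1989.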

  • Aguadé, M., M. Miyashita, and C. H. Langley. 1989. Reduced variation in the yellow-achaete-scute region in natural populations of Drosophila melanogaster. Genetics 122:607–615.

    DOI: 10.1093/genetics/122.3.607

    The first report of an RFLP study of a gene region (of about 106 kb) in Drosophila melanogaster that found very low levels of nucleotide diversity. This gene region is located very close to the telomere of the X chromosome, where crossing over is reduced. These data and similar results from other gene regions in D. melanogaster (and other Drosophila species) provoked new work on the hitchhiking model and led to the development of the model of background selection.

  • Atwood, K. C., L. K. Schneider, and F. J. Ryan. 1951. Periodic selection in Escherichia coli. Proceedings of the National Academy of Sciences of the United States of America 37:146–155.

    DOI: 10.1073/pnas.37.3.146

    A description of the experiments leading to the concept of periodic selection. This process was first observed in experimental populations of E. coli, which had been set up to be polymorphic for easily scored, neutral marker variants. When a favorable variant arises in association with a given neutral variant, the frequency of the marker variant increases. However, the rise in frequency ceases when mutations back to the other variant occur.

  • Begun, D. J., and C. F. Aquadro. 1992. Levels of naturally occurring DNA polymorphism correlate with recombination rate in D. melanogaster. Nature 356:519–520.

    DOI: 10.1038/356519a0

    Using published RFLP data for twenty gene regions across much of the D. melanogaster genome, this study found that levels of DNA variation correlate with recombination rates. In contrast, average divergence to D. simulans at these loci was hardly affected by recombination. Similar observations were later reported for other species, including humans and plants.

  • Charlesworth, B., M. T. Morgan, and D. Charlesworth. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134:1289–1303.

    DOI: 10.1093/genetics/134.4.1289

    Similar to the model of recurrent selective sweeps, the model of background selection has been used to explain the observed reduction of nucleotide variation in genomic regions of reduced recombination rates. The efforts to distinguish between the two models initiated an important phase in molecular population genetics in the middle and late 1990s.

  • Kaplan, N. L., R. R. Hudson, and C. H. Langley. 1989. The “hitchhiking effect” revisited. Genetics 123:887–899.

    DOI: 10.1093/genetics/123.4.887

    An extension of the hitchhiking model of Maynard Smith and Haigh 1974 to populations of finite constant size. The analysis is based on coalescent theory. However, to obtain results such as the reduction of neutral variation around a selected locus, extensive numerical calculations are required. This study also proposes the model of recurrent selective sweeps used in Wiehe and Stephan 1993.

  • Lewontin, R. C. 1974. The genetic basis of evolutionary change. New York: Columbia Univ. Press.

    In chapter 5 the author points out the “Paradox of Variation,” i.e., the inability of the neutral theory to explain the narrow range of observed average heterozygosities (in allozyme variability) between species, given their great variation in population size. A convincing solution to this problem has not yet been found. Many genetic, demographic, and environmental factors, including positive selection, may contribute to this discrepancy.

  • Maynard Smith, J., and J. Haigh. 1974. The hitch-hiking effect of a favourable gene. Genetical Research 23:23–35.

    DOI: 10.1017/S0016672300014634

    Generally considered the pivotal study that pioneered the concept of genetic hitchhiking in population genetics and its subsequent use in population genomic analyses. The authors proposed a deterministic model with a neutral and a linked selected locus to analyze the effect of a newly arising favorable allele on neutral polymorphism. The mathematics of the paper is relatively basic (recurrence and ordinary differential equations).

  • Ohta, T., and M. Kimura. 1975. The effect of a selected linked locus on heterozygosity of neutral alleles (the hitch-hiking effect). Genetical Research 25:313–326.

    DOI: 10.1017/S0016672300015731

    Considers the situation in which a neutral mutation appears in the population while a selected allele at a linked locus is on its way to fixation. Using a diffusion approach in combination with numerical analysis of the ensuing moment equations and Monte Carlo simulations, the authors show that the hitchhiking effect is generally unimportant as a mechanism for reducing heterozygosity.

  • Stephan, W. 2019. Selective sweeps. Genetics 211:5–13.

    DOI: 10.1534/genetics.118.301319

    A recent perspective on selective sweeps that covers topics similar to those in this review. However, some topics are presented in more detail, including the historical development of this research field and the empirical evidence for selective sweeps.

  • Wiehe, T. H. E., and W. Stephan. 1993. Analysis of a genetic hitchhiking model, and its application to DNA polymorphism data from Drosophila melanogaster. Molecular Biology and Evolution 10:842–854.

    A simple formula for the expected level of equilibrium nucleotide diversity as a function of the local recombination rate is derived for recurrent selective sweeps—i.e., sweeps that occur sequentially at multiple selected loci. This formula is then applied to the data published in Begun and Aquadro 1992.
