Genetic Drift

Introduction

So far in this course we’ve talked about changes in genotype and allele frequencies as if they were completely deterministic. Given the current allele frequencies and viabilities, for example, we wrote down an equation describing how they will change from one generation to the next: \[p' = \frac{p^2w_{11} + pqw_{12}}{\bar w} \quad .\] Notice that in writing this equation, we’re claiming that we can predict the allele frequency in the next generation without error. But suppose the population is small, say 10 diploid individuals, and our prediction is that \(p' = 0.5\). Then just as we wouldn’t be surprised if we flipped a coin 20 times and got 12 heads, we shouldn’t be surprised if we found that \(p' = 0.6\). The difference between what we expect (\(p' = 0.5\)) and what we observe (\(p' = 0.6\)) can be chalked up to statistical sampling error. What’s different is that in this case, it’s the biological process that’s producing the sampling error, not us. The biological sampling error is the cause of (or just another name for) genetic drift: the tendency for allele frequencies to change from one generation to the next in a finite population even if there is no selection.
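To make the coin-flip analogy concrete, here is a minimal simulation sketch (Python with NumPy; the variable names and replicate count are my own, not part of these notes). It repeatedly draws the 20 allele copies of a 10-individual diploid population from a gamete pool with \(p = 0.5\) and tabulates how often the realized frequency misses the predicted value.

import numpy as np

rng = np.random.default_rng(1)
N = 10                       # diploid individuals, so 2N = 20 allele copies
p = 0.5                      # predicted frequency of A1 in the next generation
reps = 100_000

counts = rng.binomial(2 * N, p, size=reps)   # A1 copies drawn in each replicate
p_next = counts / (2 * N)

print("mean of p' over replicates:", p_next.mean())   # very close to 0.5
print("P(p' = 0.6):", np.mean(p_next == 0.6))         # 12 of 20 is not unusual
print("P(p' != 0.5):", np.mean(p_next != 0.5))        # sampling error is the rule, not the exception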

A simple example

To understand in more detail what happens when there is genetic drift, let’s consider the simplest possible example: a haploid population consisting of 2 individuals.1 Suppose that we are studying a locus with only two alleles in this population, \(A_1\) and \(A_2\). This implies that \(p = q = 0.5\), but we’ll ignore that numerical fact for now and simply label the frequency of the \(A_1\) allele as \(p\) and the frequency of the \(A_2\) allele as \(q\).

We imagine the following scenario: each individual produces a very large (effectively infinite) number of gametes, and the two individuals of the next generation are formed by drawing two gametes at random from that pool.

Then it’s not too hard to see that \[\begin{aligned} \mbox{Probability that both offspring are $A_1$} &=& p^2 \\ \mbox{Probability that one offspring is $A_1$ and one is $A_2$} &=& 2pq \\ \mbox{Probability that both offspring are $A_2$} &=& q^2 \end{aligned}\] Of course \(p' = 1\) if both offspring sampled are \(A_1\), \(p' = 1/2\) if one is \(A_1\) and one is \(A_2\), and \(p' = 0\) if both are \(A_2\), so that set of equations is equivalent to this one: \[\begin{aligned} P(p'=1) &=& p^2 \label{eq:p-1} \\ P(p'=1/2) &=& 2pq \\ P(p'=0) &=& q^2 \label{eq:p-2} \end{aligned}\] In other words, we can no longer predict with certainty what allele frequencies in the next generation will be. We can only assign probabilities to each of the three possible outcomes. Of course, in a larger population the amount of uncertainty about the allele frequencies will be smaller,2 but there will be some uncertainty associated with the predicted allele frequencies unless the population is infinite.

The probability of ending up in any of the three possible states obviously depends on the current allele frequency. In probability theory we express this dependence by writing equations ([eq:p-1])–([eq:p-2]) as conditional probabilities: \[\begin{aligned} P(p_1=1|p_0) &=& p_0^2 \label{eq:p-1-1} \\ P(p_1=1/2|p_0) &=& 2p_0q_0 \\ P(p_1=0|p_0) &=& q_0^2 \label{eq:p-1-2} \end{aligned}\] We read that first equation as “the probability that the allele frequency in generation 1 (\(p_1\)) is 1 given the allele frequency in generation 0 (\(p_0\)).” I’ve introduced the subscripts so that we can distinguish among various generations in the process. Why? Because if we can write equations ([eq:p-1-1])–([eq:p-1-2]), we can also write the following equations:3 \[\begin{aligned} P(p_2=1|p_1) &=& p_1^2 \\ P(p_2=1/2|p_1) &=& 2p_1q_1 \\ P(p_2=0|p_1) &=& q_1^2 \end{aligned}\] Now if we stare at those a little while, we4 begin to see some interesting possibilities. Namely,

\[\begin{aligned} P(p_2=1|p_0) &=& P(p_2=1|p_1=1)P(p_1=1|p_0) + P(p_2=1|p_1=1/2)P(p_1=1/2|p_0) \\ &=& (1)(p_0^2) + (1/4)(2p_0q_0) \\ &=& p_0^2 + (1/2)p_0q_0 \\ P(p_2=1/2|p_0) &=& P(p_2=1/2|p_1=1/2)P(p_1=1/2|p_0) \\ &=& (1/2)(2p_0q_0) \\ &=& p_0q_0 \\ P(p_2=0|p_0) &=& P(p_2=0|p_1=0)P(p_1=0|p_0) + P(p_2=0|p_1=1/2)P(p_1=1/2|p_0) \\ &=& (1)(q_0^2) + (1/4)(2p_0q_0) \\ &=& q_0^2 + (1/2)p_0q_0 \end{aligned}\] It takes more algebra than I care to show,5 but these equations can be extended to an arbitrary number of generations. \[\begin{aligned} P(p_t=1|p_0) &=& p_0^2 + \left(1 - (1/2)^{t-1}\right)p_0q_0 \\ P(p_t=1/2|p_0) &=& p_0q_0(1/2)^{t-2} \\ P(p_t=0|p_0) &=& q_0^2 + \left(1 - (1/2)^{t-1}\right)p_0q_0 \end{aligned}\]
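These closed-form expressions are easy to check numerically. The sketch below (Python; my own code, not part of the notes) iterates the one-generation conditional probabilities for the two-individual haploid population and compares the result with the formulas for \(P(p_t = 1/2|p_0)\) and \(P(p_t = 1|p_0)\).

# Iterate the one-generation transition probabilities for the
# haploid population of size 2 (possible states: p = 0, 1/2, 1).
p0 = 0.5
q0 = 1 - p0
probs = {0.0: q0 * q0, 0.5: 2 * p0 * q0, 1.0: p0 * p0}   # distribution of p_1

t_max = 10
for _ in range(t_max - 1):        # advance from generation 1 to generation t_max
    new = {0.0: 0.0, 0.5: 0.0, 1.0: 0.0}
    for p, prob in probs.items():
        q = 1 - p
        new[1.0] += prob * p * p
        new[0.5] += prob * 2 * p * q
        new[0.0] += prob * q * q
    probs = new

print(probs[0.5], p0 * q0 * 0.5 ** (t_max - 2))                  # these agree
print(probs[1.0], p0 ** 2 + (1 - 0.5 ** (t_max - 1)) * p0 * q0)  # and so do these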

Why do I bother to show you these equations?6 Because you can see pretty quickly that as \(t\) gets big, i.e., the longer our population evolves, the smaller the probability that \(p_t = 1/2\) becomes. In fact, it’s not hard to verify several facts about genetic drift in this simple situation:

  1. Heterozygotes will be lost from the population over time.

  2. One of the two alleles originally present in the population is certain to be lost eventually.

  3. The probability that \(A_1\) is fixed, i.e., that \(p_t = 1\) at some point, is equal to its initial frequency, \(p_0\), and the probability that \(A_2\) is fixed is equal to its initial frequency, \(q_0\). (Let \(t\) get very large in the equations above: \(p_0^2 + p_0q_0 = p_0\) and \(q_0^2 + p_0q_0 = q_0\).)

All of these properties are true in general for any finite population and any number of alleles, provided that there is no mutation and no input of new genetic material from other populations. A short simulation sketch following the list below illustrates them.

  1. Heterozygotes will be lost from the population over time.

  2. Genetic drift will eventually lead to loss of all alleles in the population except one.7

  3. The probability that any allele will eventually become fixed in the population is equal to its current frequency.
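None of this is specific to a population of two. A minimal simulation sketch (Python with NumPy; the population size, starting frequency, and replicate count are my own choices, and the binomial sampling it uses anticipates the model introduced in the next section) illustrates all three properties: variation disappears, one allele fixes, and \(A_1\) fixes in a proportion of runs close to its starting frequency.

import numpy as np

rng = np.random.default_rng(2)
N = 25            # diploid individuals, so 2N = 50 allele copies
p0 = 0.2          # initial frequency of A1
reps = 10_000

fixed_A1 = 0
for _ in range(reps):
    p = p0
    while 0.0 < p < 1.0:                        # drift until one allele is lost
        p = rng.binomial(2 * N, p) / (2 * N)
    fixed_A1 += (p == 1.0)

print("proportion of runs in which A1 fixed:", fixed_A1 / reps)   # close to p0 = 0.2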

General properties of genetic drift

What I’ve shown you so far applies only to a haploid population with two individuals. Even I will admit that it isn’t a very interesting situation. Suppose, however, we now consider a population with \(N\) diploid individuals. We can treat it as if it were a population of \(2N\) haploid individuals using a direct analogy to the process I described earlier, and then things start to get a little more interesting.8

We can then write a general expression for how allele frequencies will change between generations. Specifically, the distribution describing the probability that there will be \(j\) copies of \(A_1\) in the next generation given that there are \(i\) copies in this generation is \[P(\hbox{$j$ $A_1$ in offspring $|$ $i$ $A_1$ in parents}) = {2N \choose j}\left(\frac{i}{2N}\right)^j\left(1 - \frac{i}{2N}\right)^{2N-j} \quad ,\] i.e., a binomial distribution. I’ll be astonished if any of what I’m about to say is apparent to any of you from looking at this equation, but it implies three really important things. We’ve encountered the first two of them already:

Variance of allele frequencies between generations

For a binomial distribution,

\[\begin{aligned} P(K=k) &=& {{N \choose k}p^k(1-p)^{N-k}} \\ \hbox{Var}(K) &=& Np(1-p) \\ \hbox{Var}(p) &=& \hbox{Var}(K/N) \\ &=& \frac{1}{N^2}\hbox{Var}(K) \\ &=& \frac{p(1-p)}{N} \end{aligned}\]


Applying this to our situation, \[\hbox{Var}(p_{t+1}) = \frac{p_t(1-p_t)}{2N}\] Var\((p_{t+1})\) measures the amount of uncertainty about allele frequencies in the next generation, given the current allele frequency. As you probably guessed long ago, the amount of uncertainty is inversely proportional to population size. The larger the population, the smaller the uncertainty.
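This formula is easy to check by brute force (a Python sketch with NumPy; the numbers are my own): simulate many independent one-generation transitions and compare the empirical variance of \(p_{t+1}\) with \(p_t(1-p_t)/2N\).

import numpy as np

rng = np.random.default_rng(3)
N = 50                      # diploid population size
p_t = 0.3
reps = 200_000

p_next = rng.binomial(2 * N, p_t, size=reps) / (2 * N)

print("empirical Var(p_{t+1}): ", p_next.var())
print("p_t(1 - p_t)/(2N):      ", p_t * (1 - p_t) / (2 * N))   # 0.0021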

If you think about this a bit, you might expect that a smaller variance would “slow down” the process of genetic drift, and you’d be right. It takes some pretty advanced mathematics to say how much the process slows down as a function of population size,11 but we can summarize the result in the following equation: \[\bar t \approx -4N\left(p\log p + (1-p)\log(1-p)\right) \quad ,\] where \(\bar t\) is the average time to fixation of one allele or the other and \(p\) is the current allele frequency.12 So the average time to fixation of one allele or the other increases approximately linearly with increases in the population size.
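To get a feel for the numbers, here is the approximation evaluated at an intermediate allele frequency (my own worked example). With \(p = 1/2\), \[\bar t \approx -4N\left(\tfrac{1}{2}\log\tfrac{1}{2} + \tfrac{1}{2}\log\tfrac{1}{2}\right) = 4N\log 2 \approx 2.8N \quad ,\] so a population of 100 diploid individuals would take something like 280 generations, on average, to fix one allele or the other.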

Analogy to inbreeding

You may have noticed a similarity between drift and inbreeding. Specifically, both processes lead to a loss of heterozygosity and an increase in homozygosity. This analogy leads to a useful heuristic for helping us to understand the dynamics of genetic drift.13

Remember our old friend \(f\), the inbreeding coefficient? I’m going to re-introduce you to it in the form of the population inbreeding coefficient, the probability that two alleles chosen at random from a population are identical by descent. We’re going to study how the population inbreeding coefficient changes from one generation to the next as a result of reproduction in a finite population.14

\[\begin{aligned} f_{t+1} &=& \mbox{Prob. ibd from preceding generation} \\ && + (\mbox{Prob. not ibd from prec. gen.}) \times (\mbox{Prob. ibd from earlier gen.}) \\ &=& \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)f_t \end{aligned}\] or, in general, \[f_t = 1 - \left(1 - \frac{1}{2N}\right)^t(1-f_0) \quad .\]
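The recursion and the closed form are easy to check against one another (a short Python sketch; my own code). Note the connection to drift: \(1 - f_t\) shrinks by a factor of \(1 - 1/2N\) every generation, which is exactly how the expected heterozygosity in the population decays.

# Compare the recursion f_{t+1} = 1/(2N) + (1 - 1/(2N)) f_t
# with the closed form f_t = 1 - (1 - 1/(2N))**t * (1 - f_0).
N = 20
f0 = 0.0
t_max = 50

f = f0
for _ in range(t_max):
    f = 1 / (2 * N) + (1 - 1 / (2 * N)) * f

print(f, 1 - (1 - 1 / (2 * N)) ** t_max * (1 - f0))   # identical up to rounding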

Summary

There are four characteristics of genetic drift that are particularly important for you to remember:

  1. Allele frequencies tend to change from one generation to the next simply as a result of sampling error. We can specify a probability distribution for the allele frequency in the next generation, but we cannot predict the actual frequency with certainty.

  2. There is no systematic bias to changes in allele frequency. The allele frequency is as likely to increase from one generation to the next as it is to decrease.

  3. If the process is allowed to continue long enough without input of new genetic material through migration or mutation, the population will eventually become fixed for only one of the alleles originally present.15

  4. The time to fixation of a single allele is directly proportional to population size, and the amount of uncertainty associated with allele frequencies from one generation to the next is inversely related to population size.

Effective population size

I didn’t make a big point of it, but in our discussion of genetic drift so far we’ve assumed everything about populations that we assumed to derive the Hardy-Weinberg principle, and we’ve assumed that:

  1. there are no separate sexes, i.e., any individual is equally likely to mate with any other;

  2. the population size stays the same from one generation to the next; and

  3. every individual has the same chance of reproducing, so that the number of gametes an individual has represented in the next generation is a binomial random variable.

How do we deal with the fact that one or more of these conditions will be violated in just about any case we’re interested in?18 One way would be to develop all the probability models that incorporate that complexity and try to solve them. That’s nearly impossible, except through computer simulations. Another, and by far the most common approach, is to come up with a conversion formula that makes our actual population seem like the “ideal” population that we’ve been studying. That’s exactly what effective population size is.

The effective size of a population is the size of an ideal population that has the same properties with respect to genetic drift as our actual population does.

What does that phrase “same properties with respect to genetic drift” mean? Well, there are two ways it can be defined.19

Variance effective size

You may remember20 that the variance in allele frequency in an ideal population is \[Var(p_{t+1}) = \frac{p_t(1-p_t)}{2N} \quad.\] So one way we can make our actual population equivalent to an ideal population is to make their allele frequency variances the same. We do this by calculating the variance in allele frequency for our actual population, figuring out what size of ideal population would produce the same variance, and pretending that our actual population is the same as an ideal population of that size. To put that into an equation,21 let \(\widehat{\mbox{Var}}(p)\) be the variance we calculate for our actual population. Then \[N_e^{(v)} = \frac{p(1-p)}{2\widehat{\mbox{Var}}(p)}\] is the variance effective population size, i.e., the size of an ideal population that has the same properties with respect to allele frequency variance as our actual population. Where did that equation come from? Well, if we solve for \(\widehat{\mbox{Var}}(p)\), we get this: \[\widehat{\mbox{Var}}(p) = \frac{p(1-p)}{2N_e^{(v)} } \quad ,\] which is precisely the equation for the variance in allele frequency in a population with allele frequency \(p\) and effective size \(N_e^{(v)}\).
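As a minimal numerical illustration (Python; the numbers are hypothetical, chosen only to show the arithmetic): suppose the current allele frequency is \(p = 0.4\) and the between-generation variance we estimate for our actual population is 0.003.

p = 0.4
var_hat = 0.003                      # hypothetical estimate of Var(p)

Ne_v = p * (1 - p) / (2 * var_hat)   # variance effective size
print(Ne_v)                          # 40.0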

Inbreeding effective size

You may also remember that we can think of genetic drift as analogous to inbreeding. The probability of identity by descent within populations changes in a predictable way in relation to population size, namely \[f_{t+1} = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right)f_t \quad.\] So another way we can make our actual population equivalent to an ideal population is to make them equivalent with respect to how \(f\) changes from generation to generation. We do this by calculating how the inbreeding coefficient changes from one generation to the next in our actual population, figuring out what size an ideal population would have to be to show the same change between generations, and pretending that our actual population is the same size as the ideal one. So suppose \(\hat f_t\) and \(\hat f_{t+1}\) are the actual inbreeding coefficients we’d have in our population at generation \(t\) and \(t+1\), respectively. Then \[\begin{aligned} \hat f_{t+1} &=& \frac{1}{2N_e^{(f)}} + \left(1 - \frac{1}{2N_e^{(f)}}\right)\hat f_t \\ &=& \left(\frac{1}{2N_e^{(f)}}\right)(1 - \hat f_t) + \hat f_t \\ \hat f_{t+1} - \hat f_t &=& \left(\frac{1}{2N_e^{(f)}}\right)(1 - \hat f_t) \\ N_e^{(f)} &=& \frac{1 - \hat f_t}{2(\hat f_{t+1} - \hat f_t)} \quad . \end{aligned}\] In many applications it’s convenient to assume that \(\hat f_t = 0\). In that case the calculation gets much simpler: \[N_e^{(f)} = \frac{1}{2\hat f_{t+1}} \quad .\] We also don’t lose anything by taking the simpler approach, because \(N_e^{(f)}\) depends only on how much \(f\) changes from one generation to the next, not on its actual magnitude.
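The corresponding arithmetic for the inbreeding effective size (Python; again the inbreeding coefficients are hypothetical): with \(\hat f_t = 0\) and \(\hat f_{t+1} = 0.01\),

f_t, f_t1 = 0.0, 0.01                     # hypothetical inbreeding coefficients

print((1 - f_t) / (2 * (f_t1 - f_t)))     # general formula: 50.0
print(1 / (2 * f_t1))                     # simpler version (f_t = 0): also 50.0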

Comments on effective population sizes

Those are nice tricks, but there are some important limitations. The biggest is that \(N_e^{(v)} \ne N_e^{(f)}\) if the population size is changing from one generation to the next.22 So you have to decide which of these two measures is more appropriate for the question you’re studying.

Examples

This is all pretty abstract. Let’s work through some examples to see how this all plays out.23 In the case of separate sexes and variable population size, I’ll provide a derivation of \(N_e^{(f)}\). In the case of differences in the number of offspring left by individuals, I’ll just give you the formula and we’ll discuss some of the implications.

Separate sexes

We’ll start by assuming that \(\hat f_t = 0\) to make the calculations simple. So we know that \[N_e^{(f)} = \frac{1}{2\hat f_{t+1}} \quad .\] The first thing to do is to calculate \(\hat f_{t+1}\). To do this we have to break the problem down into pieces.24

  1. When we pick two alleles at random from the offspring generation, the probability that the first is a copy of an allele carried by a female parent is \(1/2\), and given that, the probability that the second is also a copy of an allele carried by a female parent is \((N-1)/(2N-1)\). The same holds for alleles inherited from male parents.

  2. Two alleles that are both copies of alleles carried by female parents are identical by descent with probability \(1/2N_f\); two copies of alleles carried by male parents are identical by descent with probability \(1/2N_m\) (remember, we’re assuming \(\hat f_t = 0\)).

  3. If one allele is a copy of an allele from a female parent and the other is a copy of an allele from a male parent, they can’t be identical by descent from the preceding generation.

With those facts in hand, we’re ready to calculate \(\hat f_{t+1}\).

\[\begin{aligned} f_{t+1} &=& \left(\frac{1}{2}\right) \left(\frac{N-1}{2N-1}\right) \left(\frac{1}{2N_f}\right) + \left(\frac{1}{2}\right) \left(\frac{N-1}{2N-1}\right) \left(\frac{1}{2N_m}\right) \\ &=& \left(\frac{1}{2}\right) \left(\frac{N-1}{2N-1}\right) \left(\frac{1}{2N_f} + \frac{1}{2N_m}\right) \\ &\approx& \left(\frac{1}{4}\right) \left(\frac{2N_m + 2N_f}{4N_fN_m}\right) \\ &=& \left(\frac{1}{2}\right) \left(\frac{N_m + N_f}{4N_fN_m}\right) \end{aligned}\] So, \[N_e^{(f)} \approx \frac{4N_fN_m}{N_f + N_m} \quad .\]

What does this all mean? Well, consider a couple of important examples. Suppose the numbers of females and males in a population are equal, \(N_f = N_m = N/2\). Then \[\begin{aligned} N_e^{(f)} &=& \frac{4(N/2)(N/2)}{N/2 + N/2} \\ &=& \frac{4N^2/4}{N} \\ &=& N \quad . \end{aligned}\] The effective population size is equal to the actual population size if the sex ratio is 50:50. If it departs from 50:50, the effective population size will be smaller than the actual population size.

Consider the extreme case where there’s only one reproductive male in the population. Then \[N_e^{(f)} = \frac{4N_f}{N_f + 1} \quad . \label{eq:ne-harem}\] Notice what this equation implies: The effective size of a population with only one reproductive male (or female) can never be bigger than 4, no matter how many mates that individual has and no matter how many offspring are produced. At first, this is a little surprising, but when you realize that under these conditions all offspring are half sibs, it may be a little less surprising. Since every individual in the population inherited one of two alleles from the male (their father), there’s a one in four chance that any two alleles taken at random are identical by descent.
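A short sketch (Python; my own illustration) of how the sex ratio affects \(N_e^{(f)}\) for 100 breeding individuals split in different ways:

def ne_separate_sexes(n_females, n_males):
    """Effective size with separate sexes: 4 Nf Nm / (Nf + Nm)."""
    return 4 * n_females * n_males / (n_females + n_males)

for n_f, n_m in [(50, 50), (75, 25), (90, 10), (99, 1)]:
    print(n_f, n_m, ne_separate_sexes(n_f, n_m))
# (50, 50) -> 100.0   equal sex ratio: N_e equals the census size
# (75, 25) -> 75.0
# (90, 10) -> 36.0
# (99, 1)  -> 3.96    a single male caps N_e below 4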

Variable population size

The notation for this one gets a little more complicated, but the ideas are simpler than those you just survived. Since the population size is changing we need to specify the population size at each time step. Let \(N_t\) be the population size in generation \(t\). Then \[\begin{aligned} f_{t+1} &=& \left(1-\frac{1}{2N_t}\right)f_t + \frac{1}{2N_t} \\ 1 - f_{t+1} &=& \left(1-\frac{1}{2N_t}\right)(1-f_t) \\ 1 - f_{t+K} &=& \left(\prod_{i=0}^{K-1}\left(1-\frac{1}{2N_{t+i}}\right)\right)(1-f_t) \quad . \end{aligned}\] Now if the population size were constant \[\left(\prod_{i=0}^{K-1}\left(1-\frac{1}{2N_{t+i}}\right)\right) = \left(1 - \frac{1}{2N_e^{(f)}}\right)^K \quad .\] Dealing with products and powers is inconvenient, but if we take the logarithm of both sides of the equation we get something simpler:25 \[\sum_{i=0}^{K-1}\log\left(1-\frac{1}{2N_{t+i}}\right) = K\log\left(1 - \frac{1}{2N_e^{(f)}}\right) \quad .\] It’s a well-known fact26 that \(\log(1-x) \approx -x\) when \(x\) is small. So if we assume that \(N_e\) and all of the \(N_{t}\) are large,27 then \[\begin{aligned} K\left(-\frac{1}{2N_e^{(f)}}\right) &=& \sum_{i=0}^{K-1}-\frac{1}{2N_{t+i}} \\ \frac{K}{N_e^{(f)}} &=& \sum_{i=0}^{K-1}\frac{1}{N_{t+i}} \\ N_e^{(f)} &=& \left(\left(\frac{1}{K}\right) \sum_{i=0}^{K-1}\frac{1}{N_{t+i}}\right)^{-1} \end{aligned}\]

The quantity on the right side of that last equation is a well-known quantity. It’s the harmonic mean of the \(N_{t}\). It’s another well-known fact28 that the harmonic mean of a series of numbers is always less than its arithmetic mean (unless all of the numbers are identical). This means that genetic drift may play a much more important role than we might have imagined, since the effective size of a population will be more influenced by times when it is small than by times when it is large.

Consider, for example, a population in which \(N_1\) through \(N_9\) are 1000, and \(N_{10}\) is 10. \[\begin{aligned} N_e &=& \left(\left(\frac{1}{10}\right) \left(9\left(\frac{1}{1000}\right) + \left(\frac{1}{10}\right)\right)\right)^{-1} \\ &\approx& 92 \end{aligned}\] versus an arithmetic average of 901. So the population will behave with respect to the inbreeding associated with drift like a population a tenth of its arithmetic average size.
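The same arithmetic in code (a Python sketch), for nine generations of 1000 followed by one generation of 10:

sizes = [1000] * 9 + [10]

harmonic_mean = len(sizes) / sum(1 / n for n in sizes)
arithmetic_mean = sum(sizes) / len(sizes)

print(harmonic_mean)     # about 91.7, so N_e is roughly 92
print(arithmetic_mean)   # 901.0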

Variation in offspring number

I’m just going to give you this formula. I’m not going to derive it for you.29 \[N_e^{(f)} = \frac{2N - 1}{1 + \frac{V_k}{2}} \quad ,\] where \(V_k\) is the variance in number of offspring among individuals in the population. Remember I told you that the number of gametes any individual has represented in the next generation is a binomial random variable in an ideal population? Well, if the population size isn’t changing, that means that \(V_k = 2(1 - 1/N)\) in an ideal population.30 A little algebra should convince you that in this case \(N_e^{(f)} = N\). It can also be shown (with more algebra) that \(N_e^{(f)} < N\) whenever \(V_k > 2(1 - 1/N)\), and that \(N_e^{(f)} > N\) whenever \(V_k < 2(1 - 1/N)\).

That last fact is pretty remarkable. Conservation biologists try to take advantage of it to decrease the loss of genetic variation in small populations, especially those that are captive bred. If you can reduce the variance in reproductive success, you can substantially increase the effective size of the population. In fact, if you could reduce \(V_k\) to zero, then \[N_e^{(f)} = 2N - 1 \quad .\] The effective size of the population would then be almost twice its actual size.
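A short sketch (Python; my own numbers) of how \(N_e^{(f)}\) responds to the variance in offspring number for a population of \(N = 100\):

N = 100

def ne_offspring_variance(n, v_k):
    """Effective size given offspring-number variance v_k: (2n - 1) / (1 + v_k / 2)."""
    return (2 * n - 1) / (1 + v_k / 2)

ideal_vk = 2 * (1 - 1 / N)                    # binomial variance in an ideal population
print(ne_offspring_variance(N, ideal_vk))     # 100.0: N_e = N
print(ne_offspring_variance(N, 0.0))          # 199.0: N_e = 2N - 1
print(ne_offspring_variance(N, 6.0))          # 49.75: high variance shrinks N_e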

Creative Commons License

These notes are licensed under the Creative Commons Attribution License. To view a copy of this license, visit the Creative Commons website or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.