Estimating viability

Introduction

Being able to make predictions with known (or estimated) viabilities doesn’t do us a heck of a lot of good unless we can figure out what those viabilities are. Fortunately, figuring them out isn’t too hard. If we know the number of individuals of each genotype before and after selection, it’s really easy, as a matter of fact. Suppose our data look like this:

| Genotype | \(A_1A_1\) | \(A_1A_2\) | \(A_2A_2\) |
|---|---|---|---|
| Number in zygotes | \(n_{11}^{(z)}\) | \(n_{12}^{(z)}\) | \(n_{22}^{(z)}\) |
| Viability | \(w_{11}\) | \(w_{12}\) | \(w_{22}\) |
| Number in adults | \(n_{11}^{(a)} = w_{11}n_{11}^{(z)}\) | \(n_{12}^{(a)} = w_{12}n_{12}^{(z)}\) | \(n_{22}^{(a)} = w_{22}n_{22}^{(z)}\) |

In other words, estimating the absolute viability simply consists of estimating the probability that an individual of each genotype survives from zygote to adult. The maximum-likelihood estimate is, of course, just what you would probably guess:
\[w_{ij} = \frac{n_{ij}^{(a)}}{n_{ij}^{(z)}} \quad .\]
Since \(w_{ij}\) is a probability and the outcome is binary (survive or die), you should be able to guess what kind of likelihood relates the observed data to the unseen parameter, namely, a binomial likelihood. In Stan notation:

   n_11_adult ~ binomial(n_11_zygote, w_11);
   n_12_adult ~ binomial(n_12_zygote, w_12);
   n_22_adult ~ binomial(n_22_zygote, w_22);
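To make the binomial estimate concrete, here is a minimal Python sketch. The counts are hypothetical, invented only for illustration, and the function names are mine rather than anything from these notes. It computes the maximum-likelihood estimate and, as one possible Bayesian counterpart, the posterior mean under a uniform Beta(1, 1) prior. That prior is an assumption of this sketch, not a choice the notes make.

```python
# Hypothetical counts, invented for illustration (not data from the notes).
zygotes = {"A1A1": 200, "A1A2": 400, "A2A2": 200}
adults = {"A1A1": 150, "A1A2": 360, "A2A2": 100}


def viability_mle(n_zygote, n_adult):
    """Binomial maximum-likelihood estimate: the fraction that survived."""
    return n_adult / n_zygote


def viability_posterior_mean(n_zygote, n_adult):
    """Posterior mean of w under an assumed uniform Beta(1, 1) prior.

    With a binomial likelihood the posterior is
    Beta(n_adult + 1, n_zygote - n_adult + 1).
    """
    return (n_adult + 1) / (n_zygote + 2)


mle = {g: viability_mle(zygotes[g], adults[g]) for g in zygotes}
post_mean = {g: viability_posterior_mean(zygotes[g], adults[g]) for g in zygotes}

for g in zygotes:
    print(f"{g}: MLE = {mle[g]:.3f}, posterior mean = {post_mean[g]:.3f}")
```

With samples this large the two estimates nearly coincide. They differ more when only a handful of individuals are followed.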

Estimating relative viability

To estimate absolute viabilities, we have to be able to identify genotypes non-destructively, because we have to know each individual’s genotype both before and after the selection event. That’s fine if we happen to be dealing with an experimental situation where we can do controlled crosses to establish known genotypes or if we happen to be studying an organism and a trait where we can identify the genotype from the phenotype of a zygote (or at least a very young individual) and from surviving adults. What do we do when we can’t follow the survival of individuals with known genotype? Give up?

Remember that to make inferences about how selection will act, we only need to know relative viabilities, not absolute viabilities. We still need to know something about the genotypic composition of the population before selection, but it turns out that if we’re only interested in relative viabilities, we don’t need to follow individuals. All we need to be able to do is to score genotypes and estimate genotype frequencies before and after selection. Our data look like this:

| Genotype | \(A_1A_1\) | \(A_1A_2\) | \(A_2A_2\) |
|---|---|---|---|
| Frequency in zygotes | \(x_{11}^{(z)}\) | \(x_{12}^{(z)}\) | \(x_{22}^{(z)}\) |
| Frequency in adults | \(x_{11}^{(a)}\) | \(x_{12}^{(a)}\) | \(x_{22}^{(a)}\) |

We also know that
\[\begin{aligned} x_{11}^{(a)} &= w_{11}x_{11}^{(z)}/\bar w \\ x_{12}^{(a)} &= w_{12}x_{12}^{(z)}/\bar w \\ x_{22}^{(a)} &= w_{22}x_{22}^{(z)}/\bar w \quad . \end{aligned}\]
Suppose we now divide all three equations by the middle one:
\[\begin{aligned} \frac{x_{11}^{(a)}}{x_{12}^{(a)}} &= \frac{w_{11}x_{11}^{(z)}}{w_{12}x_{12}^{(z)}} \\ 1 &= 1 \\ \frac{x_{22}^{(a)}}{x_{12}^{(a)}} &= \frac{w_{22}x_{22}^{(z)}}{w_{12}x_{12}^{(z)}} \quad , \end{aligned}\]
or, rearranging a bit,
\[\begin{aligned} \frac{w_{11}}{w_{12}} &= \left(\frac{x_{11}^{(a)}}{x_{12}^{(a)}}\right) \left(\frac{x_{12}^{(z)}}{x_{11}^{(z)}}\right) \\ \frac{w_{22}}{w_{12}} &= \left(\frac{x_{22}^{(a)}}{x_{12}^{(a)}}\right) \left(\frac{x_{12}^{(z)}}{x_{22}^{(z)}}\right) \quad . \end{aligned}\]
This gives us a complete set of relative viabilities.

| Genotype | \(A_1A_1\) | \(A_1A_2\) | \(A_2A_2\) |
|---|---|---|---|
| Relative viability | \(\frac{w_{11}}{w_{12}}\) | 1 | \(\frac{w_{22}}{w_{12}}\) |
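One way to convince yourself that mean fitness really does cancel is a round-trip check. The sketch below starts from hypothetical absolute viabilities and zygote frequencies (both invented for illustration), computes the adult frequencies they imply, and then recovers the relative viabilities from the frequencies alone. The function name `relative_viability` is mine.

```python
# Hypothetical absolute viabilities and zygote frequencies (illustration only).
w = {"A1A1": 0.6, "A1A2": 0.8, "A2A2": 0.4}
x_zygote = {"A1A1": 0.25, "A1A2": 0.50, "A2A2": 0.25}

# Mean fitness, and the genotype frequencies among adults after selection.
w_bar = sum(w[g] * x_zygote[g] for g in w)
x_adult = {g: w[g] * x_zygote[g] / w_bar for g in w}


def relative_viability(genotype, x_z, x_a, reference="A1A2"):
    """Viability of `genotype` relative to `reference`, from frequencies alone."""
    return (x_a[genotype] / x_a[reference]) * (x_z[reference] / x_z[genotype])


rel = {g: relative_viability(g, x_zygote, x_adult) for g in w}

for g in w:
    print(f"{g}: recovered {rel[g]:.3f}, true ratio {w[g] / w['A1A2']:.3f}")
```

The recovered values match \(w_{ij}/w_{12}\) even though the function never sees \(\bar w\) or the absolute viabilities.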

If we use the maximum-likelihood estimates for genotype frequencies before and after selection, we obtain maximum-likelihood estimates for the relative viabilities. If we use Bayesian methods to estimate genotype frequencies before and after selection (including the uncertainty around those estimates), we can use these formulas to get Bayesian estimates of the relative viabilities (and the uncertainty around them).
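As a rough illustration of the Bayesian route, the sketch below draws genotype frequencies from a Dirichlet posterior for each life stage and pushes every draw through the formulas above. It rests on several assumptions that are mine, not the notes': the zygote and adult samples are independent multinomial samples, the prior on genotype frequencies is uniform, and the counts are hypothetical. It uses only the Python standard library. A full analysis would more likely be written as a Stan model.

```python
import random


def dirichlet_sample(counts, rng):
    """One draw from Dirichlet(counts + 1): the posterior under a uniform prior."""
    gammas = [rng.gammavariate(c + 1, 1.0) for c in counts]
    total = sum(gammas)
    return [g / total for g in gammas]


def posterior_relative_viabilities(zygote_counts, adult_counts, n_draws=4000, seed=1):
    """Posterior draws of (w11/w12, w22/w12), treating the two samples as independent."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z11, z12, z22 = dirichlet_sample(zygote_counts, rng)
        a11, a12, a22 = dirichlet_sample(adult_counts, rng)
        draws.append(((a11 / a12) * (z12 / z11), (a22 / a12) * (z12 / z22)))
    return draws


def interval(values, lower=0.025, upper=0.975):
    """Central credible interval read off the sorted posterior draws."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lower * n)], ordered[int(upper * n) - 1]


# Hypothetical genotype counts (A1A1, A1A2, A2A2), invented for illustration.
zygote_counts = [100, 200, 100]
adult_counts = [80, 200, 60]

draws = posterior_relative_viabilities(zygote_counts, adult_counts)
ci_11 = interval([d[0] for d in draws])
ci_22 = interval([d[1] for d in draws])

print(f"w11/w12: 95% interval ({ci_11[0]:.2f}, {ci_11[1]:.2f})")
print(f"w22/w12: 95% interval ({ci_22[0]:.2f}, {ci_22[1]:.2f})")
```

For these counts the point estimates from the formulas are 0.8 and 0.6, and the intervals show how much sampling uncertainty surrounds them.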

An example

Let’s see how this works with some real data from Dobzhansky’s work on chromosome inversion polymorphisms in Drosophila pseudoobscura.

| Genotype | \(ST/ST\) | \(ST/CH\) | \(CH/CH\) | Total |
|---|---|---|---|---|
| Number in larvae | 41 | 82 | 27 | 150 |
| Number in adults | 57 | 169 | 29 | 255 |

You may be wondering how the sample of adults can be larger than the sample of larvae. That’s because to score an individual’s inversion type, Dobzhansky had to kill it. The numbers in larvae are based on a sample of the population, and the adults that survived were not genotyped as larvae. As a result, all we can do is to estimate the relative viabilities.
\[\begin{aligned} \frac{w_{11}}{w_{12}} &= \left(\frac{x_{11}^{(a)}}{x_{12}^{(a)}}\right) \left(\frac{x_{12}^{(z)}}{x_{11}^{(z)}}\right) = \left(\frac{57/255}{169/255}\right) \left(\frac{82/150}{41/150}\right) = 0.67 \\ \frac{w_{22}}{w_{12}} &= \left(\frac{x_{22}^{(a)}}{x_{12}^{(a)}}\right) \left(\frac{x_{12}^{(z)}}{x_{22}^{(z)}}\right) = \left(\frac{29/255}{169/255}\right) \left(\frac{82/150}{27/150}\right) = 0.52 \quad . \end{aligned}\]
So it looks as if we have balancing selection, i.e., the fitness of the heterozygote exceeds that of either homozygote.
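The arithmetic above is easy to reproduce. The following Python sketch uses the counts from the table. The variable and function names are mine.

```python
# Counts from the table of Dobzhansky's data above.
larvae = {"ST/ST": 41, "ST/CH": 82, "CH/CH": 27}
adults = {"ST/ST": 57, "ST/CH": 169, "CH/CH": 29}


def frequencies(counts):
    """Convert genotype counts to genotype frequencies."""
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()}


x_z = frequencies(larvae)   # frequencies before selection
x_a = frequencies(adults)   # frequencies after selection
ref = "ST/CH"               # the heterozygote is the reference genotype

rel = {g: (x_a[g] / x_a[ref]) * (x_z[ref] / x_z[g]) for g in larvae}

for g, v in rel.items():
    print(f"{g}: {v:.2f}")
```

Because the sample sizes cancel within each ratio, the same numbers come out if the raw counts are used in place of the frequencies.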

We can check to see whether this conclusion is statistically justified by comparing the observed number of individuals in each genotype category in adults with what we’d expect if all genotypes were equally likely to survive.

| Genotype | \(ST/ST\) | \(ST/CH\) | \(CH/CH\) |
|---|---|---|---|
| Expected (formula) | \(\left(\frac{41}{150}\right)255\) | \(\left(\frac{82}{150}\right)255\) | \(\left(\frac{27}{150}\right)255\) |
| Expected | 69.7 | 139.4 | 45.9 |
| Observed | 57 | 169 | 29 |

\(\chi^2_2 = 14.82\), \(P < 0.001\)

So we have strong evidence that genotypes differ in their probability of survival.
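The test statistic can be checked with a few lines of Python. This sketch follows the table in treating the larval frequencies as the expected proportions. For the tail probability it relies on the fact that a chi-square variable with 2 degrees of freedom has upper-tail probability \(e^{-x/2}\), so no statistics library is needed.

```python
import math

# Counts from Dobzhansky's data, in the order ST/ST, ST/CH, CH/CH.
larvae = [41, 82, 27]
adults = [57, 169, 29]

n_larvae = sum(larvae)
n_adults = sum(adults)

# Adults expected in each class if every genotype survived equally well.
expected = [n / n_larvae * n_adults for n in larvae]

# Pearson chi-square statistic.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(adults, expected))

# Upper-tail probability for 2 degrees of freedom: exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print("expected:", [round(e, 1) for e in expected])
print(f"chi-square = {chi_sq:.2f}, P = {p_value:.4f}")
```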

We can also use our knowledge of how selection works to predict the genotype frequencies at equilibrium:
\[\begin{aligned} \frac{w_{11}}{w_{12}} &= 1 - s_1 \\ \frac{w_{22}}{w_{12}} &= 1 - s_2 \quad . \end{aligned}\]
So \(s_1 = 0.33\), \(s_2 = 0.48\), and the predicted equilibrium frequency of the \(ST\) chromosome is \(s_2/(s_1+s_2) = 0.59\).
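A short Python sketch of the equilibrium calculation follows. The function name is mine. It evaluates the formula twice, once with the rounded selection coefficients quoted above and once with unrounded ones, to show how sensitive the second decimal place is to rounding.

```python
# Relative viabilities from Dobzhansky's counts (sample sizes cancel in each ratio).
rel_11 = (57 / 169) * (82 / 41)   # ST/ST relative to ST/CH
rel_22 = (29 / 169) * (82 / 27)   # CH/CH relative to ST/CH


def equilibrium_frequency(s1, s2):
    """Equilibrium frequency of the first allele under heterozygote advantage."""
    return s2 / (s1 + s2)


# Using the rounded selection coefficients quoted in the text.
p_rounded = equilibrium_frequency(0.33, 0.48)

# Using the unrounded selection coefficients.
p_unrounded = equilibrium_frequency(1 - rel_11, 1 - rel_22)

print(f"rounded inputs:   {p_rounded:.3f}")
print(f"unrounded inputs: {p_unrounded:.3f}")
```

The rounded coefficients reproduce the 0.59 in the text. Carrying full precision gives a value a few thousandths higher, which is well within the sampling uncertainty of the estimates.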

Creative Commons License

These notes are licensed under the Creative Commons Attribution License. To view a copy of this license, visit or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.