
Introduction

We saw last time that comparing two estimators of $\theta = 4N_e\mu$ can help us to determine whether patterns of diversity within populations are consistent with neutral expectations or not. Specifically, let

\begin{displaymath}
\hat\pi = \sum \hat x_i\hat x_j\delta_{ij}/N \quad ,
\end{displaymath}

be the observed nucleotide heterozygosity, and let $\hat k$ be the observed number of segregating sites in a sample. Then

\begin{eqnarray*}
\hat \theta_\pi &=& \hat \pi \\
\hat \theta_k &=& \frac{\hat k}{\sum_{i=1}^{n-1}\frac{1}{i}} \quad ,
\end{eqnarray*}

where $n$ is the number of sequences in your sample, and

\begin{displaymath}
\hat D = \hat\theta_\pi - \hat\theta_k
\quad.
\end{displaymath}
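As a rough illustration of these definitions, the following Python sketch computes both estimators and their difference for a small, made-up alignment. The function names (`theta_pi`, `theta_k`, `d_hat`) and the sample data are hypothetical. The sketch also assumes that $\hat\pi$ is computed as the mean number of pairwise differences among the sampled sequences, so that both estimators are on a per-sequence scale. It returns the raw difference written above, without the variance scaling used in Tajima's published statistic.

```python
from itertools import combinations


def theta_pi(seqs):
    """Mean number of pairwise differences among aligned sequences."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / len(pairs)


def theta_k(seqs):
    """Number of segregating sites divided by sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    # A site is segregating if more than one nucleotide appears in its column.
    k = sum(len(set(column)) > 1 for column in zip(*seqs))
    a = sum(1.0 / i for i in range(1, n))
    return k / a


def d_hat(seqs):
    """Unscaled difference between the two estimators of theta."""
    return theta_pi(seqs) - theta_k(seqs)


# Hypothetical sample: 5 sequences, 4 segregating sites, each a singleton.
sample = [
    "AAAAAA",
    "TAAAAA",
    "ATAAAA",
    "AATAAA",
    "AAATAA",
]

print(theta_pi(sample), theta_k(sample), d_hat(sample))
```

In this sample every variant appears in only one sequence, so $\hat\theta_\pi$ falls below $\hat\theta_k$ and the difference is negative.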

$\hat D > 0$ (an excess of intermediate-frequency variants) suggests either a recent population bottleneck or some form of balancing selection. $\hat D < 0$ (an excess of rare variants) suggests either population expansion or purifying selection. A quick check in Web of Science reveals that the paper in which Tajima described this approach [4] has been cited over 3100 times since 1994, 900 of them since I last taught this course two years ago. Clearly it has been widely used for interpreting patterns of nucleotide sequence variation. Although it is a very useful statistic, Zeng et al. [5] point out that there are important aspects of the data that Tajima's $D$ does not consider. As a result, it may be less powerful, i.e., less able to detect departures from neutrality, than some alternatives.


Kent Holsinger 2010-12-13