Analyzing the genetic structure of populations: F-statistics
We'll start by reviewing a couple of definitions of FST and how we can use F-statistics to partition genetic diversity into differences among individuals within populations and differences among populations. If you've encountered a nested analysis of variance before, this will seem very similar. (In fact, we'll see that one way of estimating F-statistics from data is to give each allele type a number and to perform an analysis of variance on those numbers.) We'll also quickly review the ideas of statistical expectation and unbiased estimators.
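To make the partitioning idea concrete, here is a minimal sketch in Python. It assumes one biallelic locus, allele frequencies that are known exactly rather than estimated from a sample, and equal weight for every population. Under those assumptions it computes FST from one common textbook definition, the proportional reduction in expected heterozygosity within populations (HS) relative to the total (HT). The function name and the example frequencies are made up for illustration. This is the parameter definition only, not Nei's or Weir and Cockerham's estimator, both of which also have to deal with sampling error.

```python
def fst_from_frequencies(freqs):
    """FST for one biallelic locus, given the frequency of one allele in each
    population. Assumes the frequencies are known without sampling error and
    that every population gets equal weight."""
    if not freqs:
        raise ValueError("need at least one population")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")

    k = len(freqs)
    p_bar = sum(freqs) / k                           # mean allele frequency
    h_t = 2.0 * p_bar * (1.0 - p_bar)                # total expected heterozygosity
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / k  # mean within-population value
    if h_t == 0.0:
        raise ValueError("locus is monomorphic overall; FST is undefined")
    return (h_t - h_s) / h_t


# Hypothetical example: two populations with allele frequencies 0.2 and 0.8.
print(fst_from_frequencies([0.2, 0.8]))
```

For a biallelic locus the same quantity can be written as the variance in allele frequency among populations divided by p̄(1 - p̄), which is where the resemblance to an analysis of variance comes from.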
Once we've finished that, we're ready to rumble. We'll talk about two methods for estimating FST from data. I'll explain why Weir and Cockerham's approach is preferable to the one Nei developed. WARNING: The reasons Weir and Cockerham's approach is preferable are going to seem pretty picayune, because they involve some distinctions many people fail to make. The distinctions are, however, important and fundamental, making Weir and Cockerham's approach the clear choice.
Next time (or maybe at the end of the day today if things go really well), I'll introduce a Bayesian approach. To me it's an even better approach than Weir and Cockerham's. Unfortunately, the software I wrote to implement it is no longer compatible with modern operating systems, and I haven't had a chance to update it. Fortunately, for moderately large data sets, the estimates obtained from Weir and Cockerham's approach are virtually indistinguishable from those obtained from a Bayesian analysis.
By the way, if the notes for today's lecture look familiar, that's because they're the same notes that I linked to for last Thursday's lecture.