One limitation of the way I've described things so far is that
$\mathrm{Var}(p)$ doesn't provide a convenient way to compare population
structure from different samples.
$\mathrm{Var}(p)$ can be much larger
if both alleles are about equally common in the whole sample than if
one occurs at a mean frequency of 0.99 and the other at a frequency of
0.01. Moreover, if you stare at equations (4)-(6)
for a while, you begin to realize that they look a lot like some
equations we've already
encountered.
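Namely, if we were to define
$F_{st}$ as
$\mathrm{Var}(p)/[\bar p(1-\bar p)]$, then we could rewrite equations (4)-(6) as

$$x_{11} = \bar p^2 + F_{st}\,\bar p(1-\bar p)$$
$$x_{12} = 2\bar p(1-\bar p)(1 - F_{st})$$
$$x_{22} = (1-\bar p)^2 + F_{st}\,\bar p(1-\bar p)$$
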
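To make the point about comparability concrete, here is a minimal numerical sketch. It assumes equally weighted subpopulations and the usual Wahlund form of the genotype frequencies in the combined sample; the allele frequencies and the function names (`var_p`, `f_st`, `genotype_freqs`) are made up for illustration and are not data or notation from these notes.

```python
# Illustrative sketch: Var(p) versus F_st for two hypothetical samples.
# Assumption: every subpopulation gets equal weight.

def mean(values):
    return sum(values) / len(values)

def var_p(freqs):
    """Variance of the allele frequency among subpopulations."""
    p_bar = mean(freqs)
    return mean([(p - p_bar) ** 2 for p in freqs])

def f_st(freqs):
    """F_st defined as Var(p) / (p_bar * (1 - p_bar))."""
    p_bar = mean(freqs)
    return var_p(freqs) / (p_bar * (1 - p_bar))

def genotype_freqs(freqs):
    """Genotype frequencies in the combined sample, written in terms of F_st."""
    p_bar = mean(freqs)
    f = f_st(freqs)
    x11 = p_bar ** 2 + f * p_bar * (1 - p_bar)
    x12 = 2 * p_bar * (1 - p_bar) * (1 - f)
    x22 = (1 - p_bar) ** 2 + f * p_bar * (1 - p_bar)
    return x11, x12, x22

# Hypothetical samples: one with both alleles common, one with a rare allele.
common = [0.3, 0.4, 0.6, 0.7]        # mean frequency 0.5
rare = [1.0] * 9 + [0.9]             # mean frequency 0.99

for label, freqs in [("common", common), ("rare", rare)]:
    print(label, var_p(freqs), f_st(freqs), genotype_freqs(freqs))
```

With these made-up numbers, $\mathrm{Var}(p)$ is 0.025 in the first sample and 0.0009 in the second, a difference of more than an order of magnitude. $F_{st}$ is 0.1 in the first and about 0.09 in the second, so on that scale the two samples show about the same amount of structure.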
There may, of course, be inbreeding within populations, too. But it's
easy to incorporate this into the framework as well. Let
$H_i$ be the actual
heterozygosity in individuals within subpopulations,
$H_s$ be the
expected heterozygosity within subpopulations assuming Hardy-Weinberg
within populations, and
$H_t$ be the expected heterozygosity in the
combined population assuming Hardy-Weinberg over the whole
sample. Then thinking of
$f$ as a measure of departure from
Hardy-Weinberg and assuming that all populations depart from
Hardy-Weinberg to the same degree, i.e., that they all have the same
$f$, we can define
