Once we have constructed the haplotype network, we're then faced with
the problem of identifying nested clades. Templeton et
al. [3] propose the following algorithm to
construct a unique set of nested clades:
- Step 1.
- Each haplotype in the sample comprises a 0-step clade,
i.e., each copy of a particular haplotype in the sample is separated
by zero evolutionary steps from other copies of the same
haplotype. ``Tip'' haplotypes are those that are connected to only
one other haplotype. ``Interior'' haplotypes are those that are
connected to two or more haplotypes. Set $k = 1$.
- Step 2.
- Pick a tip haplotype that is not part of any $k$-step
network.
- Step 3.
- Identify the interior haplotype with which it is
connected by $k$ mutational steps.
- Step 4.
- Identify all tip haplotypes connected to that interior
haplotype by $k$ mutational steps.
- Step 5.
- The set of all such tip and interior haplotypes
constitutes a $k$-step clade.
- Step 6.
- If there are tip haplotypes remaining that are not part
of a $k$-step clade, return to step 2.
- Step 7.
- Identify any internal $k$-step clades that are not part
of a $(k+1)$-step clade and are separated by $k+1$ steps.
- Step 8.
- Designate these clades as ``terminal'' and return to
step 2.
- Step 9.
- Increment $k$ by one and return to step 2.
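To make the first pass through steps 2 to 6 concrete, here is a minimal Python sketch that forms 1-step clades only. It is an illustration under stated assumptions, not Templeton et al.'s implementation: the network is given as a list of edges, every edge represents exactly one mutational step, and the network contains at least one interior haplotype. The function name `one_step_clades` and the haplotype labels are made up for the example.

```python
from collections import defaultdict

def one_step_clades(edges):
    """Group each tip haplotype with the interior haplotype it attaches to.

    edges: iterable of (haplotype, haplotype) pairs, each assumed to be
    a single mutational step. Returns (tips, clades).
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Step 1: tips are connected to exactly one other haplotype.
    tips = {h for h, nbrs in neighbors.items() if len(nbrs) == 1}

    # Steps 2-6: attach every tip to its single neighbor, so all tips
    # sharing that neighbor end up in the same 1-step clade.
    clades = defaultdict(set)
    for tip in sorted(tips):
        (interior,) = neighbors[tip]
        clades[interior].update({interior, tip})

    return tips, [sorted(members) for _, members in sorted(clades.items())]

# Hypothetical network: A and B hang off C, and C-D-E form a chain.
tips, clades = one_step_clades([("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")])
print(sorted(tips))  # ['A', 'B', 'E']
print(clades)        # [['A', 'B', 'C'], ['D', 'E']]
```

The sketch stops after one pass. Handling interior clades that remain unnested (steps 7 and 8) and repeating the procedure at higher step levels (step 9) are deliberately left out.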
That sounds fairly complicated, but if you look at the
example in Figure 3, you'll see that it isn't all
that horrible.
Figure 3:
Nesting of haplotypes at the Adh locus in Drosophila melanogaster.
This algorithm produces a set of nested clades, i.e., a 1-step clade
is contained within a 2-step clade, a 2-step clade is contained within
a 3-step clade, and so on. Once such a set of nested clades has been
identified, we can calculate statistics related to the geographical
distribution of each clade in the sample. Templeton et
al. [6] define two statistics that are used in
an inferential key (the most recent version of the key is
in [4]; see Figure 4):
- Clade distance
- The average distance of each haplotype in the
particular clade from the center of its geographical
distribution. ``Distance'' may be the great circle distance or it
might be the distance measured along a presumed dispersal
corridor. The clade distance for clade $X$ is symbolized $D_c(X)$,
and it measures how far this clade has spread.
- Nested clade distance
- The average distance of the center of
distribution for this clade from the center of distribution for
the clade within which it is nested. So if clade $X$ is nested
within clade $Y$, we calculate $D_n(X)$ by determining the
geographic centers of clade $X$ and clade $Y$ and measuring the
distance between those centers. $D_n(X)$ measures how far the clade
has changed position relative to the clade from which it originated.
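The two statistics can be sketched in a few lines of Python, following the descriptions above. Several things here are assumptions made for illustration: distances are great circle distances from the haversine formula, the geographic center is taken to be the copy-weighted mean of latitude and longitude (reasonable only over small areas), and the function names and the `((lat, lon), count)` input format are invented for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def great_circle_km(p, q):
    """Haversine distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def center(samples):
    """Copy-weighted mean position; samples is a list of ((lat, lon), count)."""
    n = sum(count for _, count in samples)
    lat = sum(p[0] * count for p, count in samples) / n
    lon = sum(p[1] * count for p, count in samples) / n
    return (lat, lon)

def clade_distance(samples):
    """Average distance of each haplotype copy from the clade's center."""
    ctr = center(samples)
    n = sum(count for _, count in samples)
    return sum(great_circle_km(p, ctr) * count for p, count in samples) / n

def nested_clade_distance(samples_x, samples_y):
    """Distance between the center of a clade and the center of the
    clade it is nested within, as described in the text above."""
    return great_circle_km(center(samples_x), center(samples_y))
```

For a clade found at a single location, `clade_distance` returns zero, because every copy sits at the center of the distribution. Swapping `great_circle_km` for a distance measured along a dispersal corridor would change only that one function.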
Figure 4:
Each number corresponds to a haplotype in the
sample. Haplotypes 1 and 2 are ``tip'' haplotypes. Haplotype 3 is an
interior haplotype. The numbers in square boxes illustrate the
center for each 0-step clade (a haplotype). The hexagonal ``N''
represents the center for the clade containing 1, 2, and 3. Numbers
in ovals are the distances from the center of each collecting area
to the clade center.
Once you've calculated these distances, you randomly permute the
clades across sample locations. This shuffles the data randomly,
keeping the number of haplotypes and the sample size per location the
same as in the original data set. For each of these permutations, you
calculate the clade distance $D_c$ and the nested clade distance
$D_n$. If the observed clade distance, the
observed nested clade distance, or both are significantly different
from expected by chance, then you have evidence of (a) geographical
expansion of the clade (if $D_c$ is greater than null expectation)
or (b) a range-shift (if $D_n$ is greater than null
expectation). Using these kinds of statistics, you run your data set
through Templeton's inference key to reach a conclusion. For example,
applying this procedure to data from Ambystoma tigrinum (Figure
5), Templeton et al. [6]
construct the scenario in Figure 6.
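The permutation step can be sketched in Python as follows. This is a simplified illustration, not the procedure as implemented in any particular program: locations are treated as planar (x, y) coordinates with Euclidean distance, only the clade distance is tested, the test is one-sided (is the observed value unusually large?), and the function names are hypothetical. Shuffling clade labels among individuals while leaving each individual's location fixed keeps both the sample size per location and the number of copies of each clade unchanged.

```python
import math
import random

def clade_distance(points):
    """Mean Euclidean distance of (x, y) points from their centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)

def permutation_p_value(locations, labels, clade, n_perm=999, seed=0):
    """One-sided permutation test for the clade distance of `clade`.

    locations[i] is the (x, y) sampling site of individual i and
    labels[i] is the clade that individual belongs to.
    Returns (observed distance, p-value).
    """
    rng = random.Random(seed)

    def stat(lbls):
        return clade_distance(
            [loc for loc, lab in zip(locations, lbls) if lab == clade])

    observed = stat(labels)
    shuffled = list(labels)
    as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign clade labels to individuals
        if stat(shuffled) >= observed:
            as_large += 1
    # Count the observed arrangement as one of the permutations.
    return observed, (as_large + 1) / (n_perm + 1)
```

A small p-value suggests the clade is more widely spread than expected by chance. Testing for an unusually small distance, or testing the nested clade distance, would follow the same pattern with a different statistic or comparison.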
Figure 5:
Geographic distribution of mtDNA haplotypes in Ambystoma
tigrinum.
Figure 6:
Inference key for Ambystoma
tigrinum.
Kent Holsinger
2008-11-25