# Nested clade analysis

Once we have constructed the haplotype network, we're then faced with the problem of identifying nested clades. Templeton et al. [3] propose the following algorithm to construct a unique set of nested clades:

Step 1.
Each haplotype in the sample comprises a 0-step clade, i.e., each copy of a particular haplotype in the sample is separated by zero evolutionary steps from other copies of the same haplotype. "Tip" haplotypes are those that are connected to only one other haplotype. "Interior" haplotypes are those that are connected to two or more haplotypes. Set $k = 0$.

Step 2.
Pick a tip haplotype that is not part of any $(k+1)$-step network.

Step 3.
Identify the interior haplotype with which it is connected by $k+1$ mutational steps.

Step 4.
Identify all tip haplotypes connected to that interior haplotype by $k+1$ mutational steps.

Step 5.
The set of all such tip and interior haplotypes constitutes a $(k+1)$-step clade.

Step 6.
If there are tip haplotypes remaining that are not part of a $(k+1)$-step clade, return to step 2.

Step 7.
Identify internal $k$-step clades that are not part of a $(k+1)$-step clade and are separated by $k+1$ steps.

Step 8.
Designate these clades as "terminal" and return to step 2.

Step 9.
Increment $k$ by one and return to step 2.

That sounds fairly complicated, but if you look at the example in the accompanying figure, you'll see that it isn't all that horrible.
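To make the nesting steps more concrete, here is a minimal Python sketch of a single nesting pass, roughly steps 2 through 8 for one value of the step counter. It is an illustration under simplifying assumptions, not the procedure of Templeton et al. as published: the network is assumed to be a tree, every edge is assumed to represent exactly one mutational step (so inferred intermediate haplotypes appear as nodes), and a node left stranded with no unassigned neighbors is simply placed in a group of its own. The function name `nest_one_level` and the haplotype labels `h1` to `h7` are hypothetical.

```python
def nest_one_level(adj):
    """Group the nodes of a tree-shaped network into next-level clades.

    adj maps each node (a haplotype or lower-level clade) to the set of
    nodes it is connected to by one mutational step (an assumption of
    this sketch).
    """
    # Work on a copy so the caller's network is left untouched.
    remaining = {node: set(nbrs) for node, nbrs in adj.items()}
    clades = []
    while remaining:
        # Tips have one neighbor; stranded nodes have none.
        tips = sorted(n for n, nbrs in remaining.items() if len(nbrs) <= 1)
        if not tips:
            raise ValueError("network contains a loop; this sketch needs a tree")
        assigned = set()
        for tip in tips:
            if tip in assigned:
                continue
            if not remaining[tip]:
                # Simplification: a stranded node forms its own group.
                clade = {tip}
            else:
                # The interior node this tip hangs from, plus every tip
                # attached to that same interior node.
                (interior,) = remaining[tip]
                clade = {interior} | {
                    n for n in remaining[interior] if len(remaining[n]) == 1
                }
            clades.append(clade)
            assigned |= clade
        # Prune the nested nodes; the newly exposed ends of what is left
        # play the role of "terminal" clades on the next pass of the loop.
        remaining = {
            n: nbrs - assigned for n, nbrs in remaining.items() if n not in assigned
        }
    return clades


# Hypothetical network: h2 and h3 hang off h1, and a chain runs h1-h4-h5-h6-h7.
network = {
    "h1": {"h2", "h3", "h4"},
    "h2": {"h1"},
    "h3": {"h1"},
    "h4": {"h1", "h5"},
    "h5": {"h4", "h6"},
    "h6": {"h5", "h7"},
    "h7": {"h6"},
}
one_step_clades = nest_one_level(network)
```

In this example the first pass groups `h1`, `h2`, and `h3` together and `h6` with `h7`; the leftover pair `h4` and `h5` is then treated as terminal and grouped on the second pass. Moving to the next nesting level would mean collapsing each group into a single node and calling the function again on the collapsed network.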

This algorithm produces a set of nested clades, i.e., a 1-step clade is contained within a 2-step clade, a 2-step clade is contained within a 3-step clade, and so on. Once such sets of nested clades have been identified, we can calculate statistics related to the geographical distribution of each clade in the sample. Templeton et al. [6] define two statistics that are used in an inferential key (the most recent version of the key is in [4]; see the accompanying figure):