The diversity of laboratory techniques used to reveal molecular
variation is even greater than the diversity of underlying physical
structures. Various techniques involving direct measurement of aspects
of DNA sequence variation are by far the most common today, so I'll
mention only the techniques that have been most widely used.
- Immunological distance
- Some molecules, notably protein
molecules, induce an immune response in common laboratory
mammals. The extent to which antibodies raised against a protein from
one species, say humans, cross-react with the homologous protein from
another, say chimps, can be used as a measure of evolutionary
distance. The immunological distance between humans and
chimps is smaller than it is between humans and orangutans,
suggesting that humans and chimps share a more recent common
ancestor.
- DNA-DNA hybridization
- Once repetitive sequences of DNA have
been "subtracted out", the rate and temperature at
which DNA strands from two different species anneal reflect the
average percent sequence divergence between them. The percent
sequence divergence can be used as a measure of evolutionary
distance. Immunological distances and DNA-DNA hybridization were
used primarily to identify phylogenetic relationships among
species. Neither is now widely used in molecular evolution studies.
- Isozymes
- Biochemists recognized in the late 1950s that many
soluble enzymes occurred in multiple forms within a single
individual. Population geneticists, notably Hubby and Lewontin, later
recognized that in many cases these different forms corresponded to
different alleles at a single locus, called allozymes. Allozymes are
relatively easy to score in most macroscopic organisms, they are
typically co-dominant (the allelic composition of heterozygotes can
be inferred), and they allow investigators to identify both variable
and non-variable loci. However, patterns of variation at allozyme
loci may not be representative of genetic variation that does not
result from differences in protein structure or that is related to
variation in insoluble proteins.
- RFLPs
- In the 1970s molecular geneticists discovered restriction
enzymes, enzymes that cleave DNA at specific 4, 5, or 6 base pair
sequences known as recognition sites. A single nucleotide change in
a recognition site is usually enough to eliminate it. Thus, the presence
or absence of a restriction site at a particular position in a
genome provides compelling evidence of an underlying difference in
nucleotide sequence at that position.
- RAPDs, AFLPs, ISSRs
- With the advent of the polymerase chain
reaction in the late 1980s, several related techniques were developed
for the rapid assessment of genetic variation in organisms for which
little or no prior genetic information was available. These methods
differ in the details of how the laboratory procedures are performed,
but they
are similar in that they (a) use PCR to amplify anonymous stretches
of DNA, (b) generally produce larger amounts of variation than
allozyme analyses of the same taxa, and (c) are bi-allelic, dominant
markers. They have the advantage, relative to allozymes, that they
sample more or less randomly through the genome. They have the
disadvantage that heterozygotes cannot be distinguished from
dominant homozygotes, meaning that it is difficult to use them to
obtain information about levels of within-population
inbreeding.
- Microsatellites
- Satellite DNA, highly repetitive DNA associated
with heterochromatin, had been known since biochemists first began
to characterize the large-scale structure of genomes in DNA-DNA
hybridization studies. In the mid-to-late 1980s several investigators
identified smaller repetitive units dispersed throughout many
genomes. Microsatellites, which consist of short (2-6 nucleotide)
sequences repeated many times, have proven particularly useful for
analyses of variation within populations since the
mid-1990s. Because of high mutation rates at each locus, they
commonly have many alleles. Moreover, they are typically
co-dominant, making them more generally useful than dominant
markers. Identifying variable microsatellite loci is more laborious
than identifying AFLPs, RAPDs, or ISSRs.
- Nucleotide sequence
- The advent of automated sequencing has
greatly increased the amount of population-level data available on
nucleotide sequences. Nucleotide sequence data has an important
advantage over most of the types of data discussed so far:
allozymes, RFLPs, AFLPs, RAPDs, and ISSRs may all hide
variation. Nucleotide sequence differences need not be reflected in
any of those markers. On the other hand, each of those markers
provides information on variation at several or many independently
inherited loci, whereas nucleotide sequence information typically
reveals differences at a single location that rarely extends more
than 2-3 kb. Of course, as next-generation sequencing techniques
become less expensive and more widely available, we will see more and
more examples of nucleotide sequence variation from many loci within
individuals.
- Single nucleotide polymorphisms
- In organisms that are
genetically well-characterized it may be possible to identify a
large number of single nucleotide positions that harbor
polymorphisms. These SNPs potentially provide high-resolution
insight into patterns of variation within the genome. For example,
the HapMap project has identified approximately 3.2 million SNPs in the
human genome, or about one every kb [1].
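To make the RFLP description concrete, here is a minimal sketch of how a single nucleotide change can eliminate a recognition site and so change the fragment pattern. The two sequences are made up for illustration. I use GAATTC, the recognition site of EcoRI, which cuts after the first G. The function names are my own, and the sketch treats the DNA as a single linear strand.

```python
# Illustrative only: invented sequences, single strand, linear molecule.
SITE = "GAATTC"   # EcoRI recognition site
CUT_OFFSET = 1    # EcoRI cuts between G and AATTC

def restriction_sites(seq, site=SITE):
    """Return the 0-based start position of every occurrence of `site`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def fragment_lengths(seq, site=SITE, cut_offset=CUT_OFFSET):
    """Return the fragment lengths produced by cutting at every site."""
    cuts = [p + cut_offset for p in restriction_sites(seq, site)]
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]

# Two hypothetical alleles that differ at one position (A -> G in the second site).
allele_a = "ATCGGAATTCTTAGGCCGAATTCAAT"
allele_b = "ATCGGAATTCTTAGGCCGAGTTCAAT"

for name, seq in (("a", allele_a), ("b", allele_b)):
    print(name, restriction_sites(seq), fragment_lengths(seq))
```

The two alleles differ by one nucleotide, but allele a is cut into three fragments and allele b into two, which is the difference an RFLP assay detects. A substitution outside a recognition site would leave the fragment pattern unchanged, so this kind of marker can hide variation.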
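The difficulty with dominant markers such as RAPDs, AFLPs, and ISSRs can also be illustrated with a small calculation. The sketch below assumes the standard one-locus model with two alleles, in which the band-absent (recessive homozygote) phenotype has frequency (1 - f)q^2 + fq. Here q is the frequency of the null allele and f is the inbreeding coefficient. The function name and the numbers are mine, chosen only for illustration.

```python
import math

def null_allele_freq(absent_freq, f=0.0):
    """Solve (1 - f) * q**2 + f * q = absent_freq for q, with 0 <= f <= 1.

    absent_freq is the observed frequency of band-absent individuals and
    f is an assumed inbreeding coefficient (f = 0 is Hardy-Weinberg).
    """
    if f == 1.0:
        # Complete inbreeding: no heterozygotes, so q equals the phenotype frequency.
        return absent_freq
    a, b, c = 1.0 - f, f, -absent_freq
    # Positive root of the quadratic a*q^2 + b*q + c = 0.
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

# The same observation, 25% band-absent individuals, under different assumed f.
for f in (0.0, 0.25, 0.5):
    print(f"f = {f:.2f}  ->  q = {null_allele_freq(0.25, f):.3f}")
```

The same phenotype frequency is compatible with many combinations of q and f, so the allele frequency can be estimated only by assuming a value for f. With co-dominant markers, heterozygotes are counted directly and f can be estimated from the data.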
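For microsatellites, the alleles at a locus typically differ in the number of times the short motif is repeated. The following is a rough sketch of how one might locate such repeats in a sequence and count the copies. The motif length of 2-6 nucleotides comes from the description above. The threshold of at least five tandem copies, the function name, and the sequences are my own assumptions for illustration.

```python
import re

# A motif of 2-6 nucleotides followed by at least four more copies of itself
# (five or more tandem copies in total; the threshold is an arbitrary choice).
MICROSAT = re.compile(r"([ACGT]{2,6}?)\1{4,}")

def find_microsatellites(seq):
    """Return (start, motif, copy_number) for each tandem repeat found."""
    hits = []
    for m in MICROSAT.finditer(seq):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Two hypothetical alleles with the same flanking sequence but different
# numbers of CA repeats.
allele_1 = "GGTT" + "CA" * 8 + "TTAGC"
allele_2 = "GGTT" + "CA" * 11 + "TTAGC"

print(find_microsatellites(allele_1))
print(find_microsatellites(allele_2))
```

In practice the alleles are scored by the length of the PCR product amplified from primers in the flanking sequence, not by scanning sequence data. The pattern is deliberately simple and will also report runs of a single nucleotide as two-nucleotide motifs.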
As you can see from these brief descriptions, each of the markers
reveals different aspects of underlying hereditary differences among
individuals, populations, or species. There is no single "best"
marker for evolutionary analyses. Which is best depends on the
question you are asking. In many cases in molecular evolution, the
interest is intrinsically in the evolution of the molecule itself, so
the choice is based not on what the molecules reveal about the
organisms that contain them but on which molecules raise the most
interesting questions.
Kent Holsinger
2010-12-13