One of the challenges we always face is figuring out what patterns make sense. Take a bunch of nucleotide sequence data from a bunch of different nuclear genes within a large population of an outcrossing species, throw it into your favorite phylogeny program, and you'll get out a tree, one tree[1] -- even if each gene has a different evolutionary history.[2] Or take a bunch of data from a single gene and a bunch of different populations and throw it into the same program, and you'll get out a tree -- even if the populations show a linear cline or isolation by distance.
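To see why unlinked genes need not share a single history, here is a minimal sketch of a neutral coalescent in Python. It is my own toy illustration, not anything from the paper. It assumes a single randomly mating population and no recombination within a locus, and it ignores branch lengths. The names `coalescent_topology` and `count_distinct_topologies` are made up for this example.

```python
import random


def coalescent_topology(samples, rng):
    """Simulate the genealogy of one locus, returning only its topology.

    Going backward in time, two lineages chosen uniformly at random
    coalesce at each step until a single ancestor remains. The topology
    is returned as the set of clades (groups of samples) that are formed.
    """
    lineages = [frozenset([s]) for s in samples]
    clades = set()
    while len(lineages) > 1:
        a, b = rng.sample(lineages, 2)  # pick two distinct lineages
        lineages.remove(a)
        lineages.remove(b)
        merged = a | b
        clades.add(merged)
        lineages.append(merged)
    return frozenset(clades)


def count_distinct_topologies(n_samples=5, n_loci=50, seed=1):
    """Simulate n_loci independent (unlinked) loci for the same individuals
    and count how many different topologies turn up."""
    rng = random.Random(seed)
    samples = [f"ind{i}" for i in range(n_samples)]
    topologies = [coalescent_topology(samples, rng) for _ in range(n_loci)]
    return len(set(topologies)), n_loci


if __name__ == "__main__":
    distinct, total = count_distinct_topologies()
    print(f"{distinct} distinct topologies among {total} unlinked loci")
```

Every locus here is sampled from the same individuals in the same population, yet the loci disagree about who is most closely related to whom. A program that returns a single tree for the combined data hides that disagreement.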
Wouldn't it be great if you could throw your data into a program and have it figure out whether a tree is the best way to structure your data or if some linear order or a dominance hierarchy or something else made more sense? Well, hang on to your hats. There's a recent paper suggesting that it might just be possible.
I'm no expert on the Supreme Court, but the placement of the various justices on that spectrum looks pretty reasonable to me. The animal tree has a problem, since it puts birds closer to mammals than to alligators and iguanas. But alligators and iguanas are the only tetrapods in the tree other than birds and mammals, and there are only two fish. So the problem may be with the long branches in the middle of the tree rather than a fundamental problem with the approach. Moreover, the data used to generate the tree consisted of 106 binary characters like "perceptual features (is black), anatomical features (has feet), ecological features (lives in the ocean), and behavioral features (makes loud noises)." Given that none of those characters are good homologies, it's surprising how well the result turns out.
All of the datasets and code for running the model are available from http://charleskemp.com. I can see myself spending some time playing around with this in the not too distant future. The results look very promising.
[1] Even if you throw it into MrBayes, chances are you'll focus on the majority-rule consensus tree as your one best guess for the evolutionary history.
[2] In a large population of an outcrossing species, different (unlinked) genes will have different evolutionary histories because drift acts independently at each locus, leading to different coalescent histories.
Holyoak, K.J. (2008). Induction as model selection. Proceedings of the National Academy of Sciences, 105(31), 10637-10638. DOI: 10.1073/pnas.0805910105
Kemp, C., Tenenbaum, J.B. (2008). The discovery of structural form. Proceedings of the National Academy of Sciences, 105(31), 10687-10692. DOI: 10.1073/pnas.0802631105