Uncommon Ground


Causes of genetic differentiation in Protea repens

American Journal of Botany Volume 104, Number 5. May 2017.

Protea repens is the most widespread member of the genus. It was one of the focal species in our recently completed Dimensions of Biodiversity project. Part of the project involved genotyping-by-sequencing analyses of 663 individuals from 19 populations spanning most of the geographical range of the species. We summarize results of those analyses in a paper that just appeared in advance of the May issue (cover photo featured above) of the American Journal of Botany. Here’s the abstract. You’ll find the citation and a link at the bottom.

PREMISE OF THE STUDY: The Cape Floristic Region (CFR) of South Africa is renowned for its botanical diversity, but the evolutionary origins of this diversity remain controversial. Both neutral and adaptive processes have been implicated in driving diversification, but population-level studies of plants in the CFR are rare. Here, we investigate the limits to gene flow and potential environmental drivers of selection in Protea repens L. (Proteaceae L.), a widespread CFR species.
METHODS: We sampled 19 populations across the range of P. repens and used genotyping by sequencing to identify 2066 polymorphic loci in 663 individuals. We used a Bayesian FST outlier analysis to identify single-nucleotide polymorphisms (SNPs) marking genomic regions that may be under selection; we used those SNPs to identify potential drivers of selection and excluded them from analyses of gene flow and genetic structure.
RESULTS: A pattern of isolation by distance suggested limited gene flow between nearby populations. The populations of P. repens fell naturally into two or three groupings, which corresponded to an east-west split. Differences in rainfall seasonality contributed to diversification in highly divergent loci, as do barriers to gene flow that have been identified in other species.
CONCLUSIONS: The strong pattern of isolation by distance is in contrast to the findings in the only other widespread species in the CFR that has been similarly studied, while the effects of rainfall seasonality are consistent with well-known patterns. Assessing the generality of these results will require investigations of other CFR species.

Prunier, R., M. Akman, C.T. Kremer, N. Aitken, A. Chuah, J. Borevitz, and K. E. Holsinger. Isolation by distance and isolation by environment contribute to population differentiation in Protea repens (Proteaceae L.), a widespread South African species. American Journal of Botany doi: 10.3732/ajb.1600232 
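
For readers who want a feel for what an isolation-by-distance pattern looks like in practice, here is a minimal sketch of a Mantel test relating a pairwise genetic distance matrix to log geographic distance. It is not the analysis from the paper: the coordinates, the slope, the noise level, and the function name `mantel_test` are all made up for illustration, and the "genetic distance" is only a stand-in for something like FST/(1 - FST).

```python
import numpy as np

def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Correlate two square distance matrices and assess significance by
    permuting the rows and columns of one of them (a Mantel test)."""
    rng = np.random.default_rng(seed)
    n = dist_a.shape[0]
    upper = np.triu_indices(n, k=1)            # use each population pair once
    r_obs = np.corrcoef(dist_a[upper], dist_b[upper])[0, 1]
    n_as_large = 0
    for _ in range(n_perm):
        order = rng.permutation(n)             # relabel the populations
        shuffled = dist_b[np.ix_(order, order)]
        r_perm = np.corrcoef(dist_a[upper], shuffled[upper])[0, 1]
        if r_perm >= r_obs:
            n_as_large += 1
    p_value = (n_as_large + 1) / (n_perm + 1)  # one-sided permutation p-value
    return r_obs, p_value

# Hypothetical toy data: 19 populations scattered over a 500 km square.
rng = np.random.default_rng(42)
n_pops = 19
coords_km = rng.uniform(0, 500, size=(n_pops, 2))
offsets = coords_km[:, None, :] - coords_km[None, :, :]
geo_km = np.sqrt((offsets ** 2).sum(axis=-1))
log_geo = np.log1p(geo_km)                     # log(1 + d) keeps the diagonal finite

# Synthetic genetic distances that increase with log distance, plus symmetric noise.
noise = rng.normal(scale=0.003, size=(n_pops, n_pops))
noise = (noise + noise.T) / 2
np.fill_diagonal(noise, 0.0)
genetic = 0.01 * log_geo + noise               # stand-in for FST / (1 - FST)

r_toy, p_toy = mantel_test(log_geo, genetic)
print(f"Mantel r = {r_toy:.2f}, one-sided p = {p_toy:.3f}")
```

A permutation test is used because the entries of a distance matrix are not independent: each population appears in many pairs, so an ordinary regression p-value would be too optimistic.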

Happy Birthday Sir David Attenborough

Wildscreen’s photograph of David Attenborough at ARKive’s launch in Bristol, England © May 2003

Today is Sir David Attenborough’s 91st birthday. If you follow me on Twitter or read this blog, you don’t need me to tell you who he is, but just as a reminder, here is some of his biography from IMDb:

Born 8 May 1926, the younger brother of actor Lord Richard Attenborough. He never expressed a wish to act and, instead, studied Natural Sciences at Cambridge University, graduating in 1947, the year he began his two years National Service in the Royal Navy. In 1952, he joined BBC Television at Alexandra Palace and, in 1954, began his famous “Zoo Quest” series. When not “Zoo Questing”, he presented political broadcasts, archaeological quizzes, short stories, gardening and religious programmes. 1964 saw the start of BBC2, Britain’s third TV channel, with Michael Peacock as its Controller. A year later, Peacock was promoted to BBC1 and Attenborough became Controller of BBC2. As such, he was responsible for the introduction of colour television into Britain, and also for bringing Monty Python’s Flying Circus (1969) to the world. In 1969, he was appointed Director of Programmes with editorial responsibility for both the BBC’s television networks. Eight years behind a desk was too much for him, and he resigned in 1973 to return to programme making. First came “Eastwards with Attenborough”, a natural history series set in South East Asia, then “The Tribal Eye”, examining tribal art. In 1979, he wrote and presented all 13 parts of Life on Earth (1979) (then the most ambitious series ever produced by the BBC Natural History Unit). This became a trilogy, with The Living Planet (1984) and The Trials of Life (1990).

I knew about Life on Earth, The Living Planet, and The Trials of Life (obviously). I didn’t know that he’d introduced color TV to Britain or that he was responsible for “bringing Monty Python’s Flying Circus to the world.” What an amazing set of accomplishments. His contributions are simply astounding.

More at Wikipedia and Biography

The beauty of fynbos

The beauty of our fynbos from CapeNature on Vimeo.

In case you’ve ever wondered why I have spent so much time working on, thinking about, and writing about Protea, this video from CapeNature will give you a bit of a clue. The fynbos is a very interesting place. It has an enormous diversity of plants, many of which are found nowhere else in the world, and much of that diversity is concentrated in a relatively small number of big evolutionary radiations, one of which is Protea. One of my students, Kristen Nolting (@KristenNolting on Twitter), pointed me to this video. Thanks, Kristen.


A new phylogeny for Protea


Protea compacta near Kleinmond, Western Cape, South Africa

The genus Protea is one of the iconic evolutionary radiations in the Greater Cape Floristic Region of southwestern South Africa. Its range extends north through Mozambique into parts of central Africa, but the vast majority of species are found in South Africa. From 2011 to 2014 we collected samples from most of the South African species (59 in total), and for most species we sampled several individuals from different populations. Over the last couple of years, we extracted DNA, built libraries for next-generation sequencing using targeted phylogenomics, and constructed a highly resolved estimate of phylogenetic relationships in the genus. The paper describing our results is now out in “early view” in American Journal of Botany. Most species from which we have multiple samples are supported as monophyletic units, and most relationships we identify are strongly supported (> 90% support in ASTRAL-II and SVDquartets analyses). We use the species tree from our data as a backbone to provide reliable estimates of relationship for additional species that were included in a paper by Schnitzler and colleagues but for which we did not have samples.

Mitchell, N., P.O. Lewis, E.M. Lemmon, A.R. Lemmon, and K.E. Holsinger.  2017.  Anchored phylogenomics improves the resolution of evolutionary relationships in the rapid radiation of Protea L. American Journal of Botany doi: 10.3732/ajb.1600227

The influence of climate on tree growth

Northern Hemisphere temperature changes estimated from various proxy records shown in blue (Mann et al. 1999). Instrumental data shown in red. Note the large uncertainty (grey area) as you go further back in time.

Ecologists and paleoecologists have used the width of tree rings for years as a way of inferring past climates. In fact, tree ring data were an important component of the proxy data Mann et al. (1998) used when they constructed their famous hockey stick representing global surface temperatures over the last millennium. I don’t have anything as earth-shattering as a hockey stick to share with you, but I am pleased to report that a paper on which I am a co-author demonstrates how to combine tree ring and growth increment data (with other data) to predict growth of forest trees. Here’s the abstract and a link to the paper on bioRxiv.


Fusing tree-ring and forest inventory data to infer influences on tree growth

Better understanding and prediction of tree growth is important because of the many ecosystem services provided by forests and the uncertainty surrounding how forests will respond to anthropogenic climate change. With the ultimate goal of improving models of forest dynamics, here we construct a statistical model that combines complementary data sources: tree-ring and forest inventory data. A Bayesian hierarchical model is used to gain inference on the effects of many factors on tree growth (individual tree size, climate, biophysical conditions, stand-level competitive environment, tree-level canopy status, and forest management treatments) using both diameter at breast height (DBH) and tree-ring data. The model consists of two multiple regression models, one each for the two data sources, linked via a constant of proportionality between coefficients that are found in parallel in the two regressions. The model was applied to a dataset developed at a single, well-studied site in the Jemez Mountains of north-central New Mexico, U. S. A. Inferences from the model included positive effects of seasonal precipitation, wetness index, and height ratio, and negative effects of seasonal temperature, southerly aspect and radiation, and plot basal area. Climatic effects inferred by the model compared well to results from a dendroclimatic analysis. Combining the two data sources did not lead to higher predictive accuracy (using the leave-one-out information criterion, LOOIC), either when there was a large number of increment cores (129) or under a reduced data scenario of 15 increment cores. However, there was a clear advantage, in terms of parameter estimates, to the use of both data sources under the reduced data scenario: DBH remeasurement data for ~500 trees substantially reduced uncertainty about non-climate fixed effects on radial increments. 
We discuss the kinds of research questions that might be addressed when the high-resolution information on climate effects contained in tree rings are combined with the rich metadata on tree- and stand-level conditions found in forest inventories, including carbon accounting and projection of tree growth and forest dynamics under future climate scenarios.
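
To make the idea of "two regressions linked via a constant of proportionality" a little more concrete, here is a toy simulation. It is only a sketch under loud assumptions: the coefficient values, noise levels, and the constant `alpha_true` are invented, the only numbers taken from the abstract are the sample sizes (129 cores, roughly 500 remeasured trees), and the two-step least-squares fit is a stand-in for illustration, not the Bayesian hierarchical model described in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

n_cores, n_dbh = 129, 500          # sample sizes mentioned in the abstract
beta_true = np.array([0.8, -0.5])  # hypothetical effects (e.g. precipitation, temperature)
alpha_true = 2.0                   # hypothetical constant of proportionality

# Standardized covariates for the trees in each data source (simulated).
X_ring = rng.normal(size=(n_cores, 2))
X_dbh = rng.normal(size=(n_dbh, 2))

# Tree-ring increments respond with coefficients beta;
# DBH increments respond with the same coefficients scaled by alpha.
y_ring = X_ring @ beta_true + rng.normal(scale=0.3, size=n_cores)
y_dbh = alpha_true * (X_dbh @ beta_true) + rng.normal(scale=0.6, size=n_dbh)

# Step 1: fit each regression separately by ordinary least squares.
b_ring, *_ = np.linalg.lstsq(X_ring, y_ring, rcond=None)
b_dbh, *_ = np.linalg.lstsq(X_dbh, y_dbh, rcond=None)

# Step 2: the scalar that best maps the ring coefficients onto the DBH
# coefficients (least-squares projection of b_dbh onto b_ring).
alpha_hat = (b_ring @ b_dbh) / (b_ring @ b_ring)

print("ring coefficients:", np.round(b_ring, 2))
print("DBH coefficients: ", np.round(b_dbh, 2))
print("estimated constant of proportionality:", round(float(alpha_hat), 2))
```

In a joint model the shared coefficients and the constant would be estimated together, so that information from the larger DBH data set can tighten estimates that the smaller tree-ring data set leaves uncertain. The two-step version above only shows how the pieces relate to one another.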

Don’t overinterpret STRUCTURE plots

[Image: screenshot of the tweet about the preprint]
Several weeks ago1 Daniel Falush (@DanielFalush) posted a preprint on bioRxiv, “A tutorial on how (not) to over-interpret STRUCTURE/ADMIXTURE bar plots”. I finally had a chance to read it this weekend. Here’s the abstract:

Genetic clustering algorithms, implemented in popular programs such as STRUCTURE and ADMIXTURE, have been used extensively in the characterisation of individuals and populations based on genetic data. A successful example is reconstruction of the genetic history of African Americans who are a product of recent admixture between highly differentiated populations. Histories can also be reconstructed using the same procedure for groups which do not have admixture in their recent history, where recent genetic drift is strong or that deviate in other ways from the underlying inference model. Unfortunately, such histories can be misleading. We have implemented an approach (available at www.paintmychromsomes.com) to assessing the goodness of fit of the model using the ancestry ‘palettes’ estimated by CHROMOPAINTER and apply it to both simulated and real examples. Combining these complementary analyses with additional methods that are designed to test specific hypothesis allows a richer and more robust analysis of recent demographic history based on genetic data.

A key observation Falush and his co-authors make is that different demographic scenarios can lead to the same STRUCTURE diagram. They illustrate three different scenarios. In all of them, they simulate data from 12 populations but sample from only four of them. In all of the scenarios, population P4 has been isolated from the other three populations in the sample for a long time. It’s the relationship between P1, P2, and P3 that differs among the scenarios.

  • Recent admixture: P1 and P3 have also been distinct for some time, and P2 is a recent admixture of P1, P3, and P4.
  • Ghost admixture: P1 and P3 diverged some time ago, and P2 is a recent admixture of P1 and a “ghost” population more closely related to P3 than to P1.
  • Recent bottleneck: P1 is sister to P2 but underwent a strong recent bottleneck.

[Figure: STRUCTURE bar plots estimated from data simulated under each of the three scenarios]

As you can see, the STRUCTURE diagrams estimated from data simulated in each scenario are indistinguishable. They also show that if you have additional data available, specifically if you are lucky enough to be working in an organism with a lot of SNPs that are mapped, then you can combine estimates from CHROMOPAINTER with those from STRUCTURE to distinguish the recent admixture scenario from the other two – assuming that you’ve picked a reasonable number for K, the number of subpopulations.2
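
Here is a small sketch of why unlinked markers alone struggle to tell these stories apart. Under the admixture model that STRUCTURE-type programs fit, each allele copy in an individual is drawn from one of the K source populations according to that individual's ancestry proportions. The allele frequencies and ancestry proportions below are hypothetical, and this is a toy calculation, not the simulation used in the preprint.

```python
from math import comb, isclose

def admixed_allele_freq(q, p):
    """Per-copy allele frequency at one locus for an individual with ancestry
    proportions q over source populations whose allele frequencies are p."""
    return sum(q_k * p_k for q_k, p_k in zip(q, p))

def genotype_probs(freq):
    """Probabilities of carrying 0, 1, or 2 copies of the allele when the two
    allele copies are drawn independently with probability freq."""
    return [comb(2, g) * freq ** g * (1 - freq) ** (2 - g) for g in range(3)]

# Hypothetical allele frequencies at one locus in two source populations.
p_sources = [0.75, 0.25]

# Scenario A: an individual that is a 50/50 admixture of the two sources.
f_admixed = admixed_allele_freq([0.5, 0.5], p_sources)

# Scenario B: an individual from a separate, unadmixed population that simply
# has an intermediate allele frequency at this locus.
f_unadmixed = 0.5

probs_admixed = genotype_probs(f_admixed)
probs_unadmixed = genotype_probs(f_unadmixed)

print("admixed individual:  ", probs_admixed)
print("unadmixed individual:", probs_unadmixed)
```

Locus by locus, the two scenarios imply the same genotype probabilities, so a bar plot built from unlinked markers cannot separate them. That is where mapped SNPs come in: linkage information lets haplotype-based methods such as CHROMOPAINTER pick up signal that the per-locus model throws away.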

The authors also refer to Puechmaille’s recent work demonstrating that estimates of genetic structure are greatly affected by sample size. Bottom line: Read both this paper and Puechmaille’s if you use STRUCTURE, tread cautiously when interpreting results, and don’t expend too much effort trying to estimate the “right” K.

1 OK, as you can see from the tweet, it was almost a month ago.

2 The paper contains a brief remark about how hard it is to estimate K: “Unless the demographic history of the sample is particularly simple, the value of K inferred according to any statistically sensible criterion is likely to be smaller than the number of distinct drift events that have significantly impacted the sample. What the algorithm often does is in practice use variation in admixture proportions between individuals to approximately mimic the effect of more than K distinct drift events without estimating ancestral populations corresponding to each one.”

Falush, D., L. van Dorp, and D. Lawson. 2016. A tutorial on how (not) to over-interpret STRUCTURE/ADMIXTURE bar plots. bioRxiv doi: 10.1101/066431
Lawson, D.J., G. Hellenthal, S. Myers, and D. Falush. 2012. Inference of population structure using dense haplotype data. PLoS Genetics 8:e1002453. doi: 10.1371/journal.pgen.1002453
Puechmaille, S.J. 2016. The program structure does not reliably recover the correct population structure when sampling is uneven: subsampling and new estimators alleviate the problem. Molecular Ecology Resources 16:608-627. doi: 10.1111/1755-0998.12512

Summary of tweeting from #Botany2016

Twitter activity for #Botany2016 has declined now that the conference has been over for a couple of days.


Spirits remained high throughout the runup to the conference, dipping below zero only once about a week before everyone arrived.


@JChrisPires contributed a larger number of tweets (including tweets of others that he retweeted) than anyone else,


but @uribe_convers had a larger impact, regardless of whether you measure impact in number of retweets


or in terms of number of likes.


If you’d like to play around with the code, it’s available on GitHub: https://github.com/kholsinger/Twitter-stats.
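
If you just want to see the flavor of the tallies summarized above (tweets per account, retweets, and likes), here is a minimal sketch. It is not the code in the repository: the records, the field names, and the account names `user_a` and `user_b` are all hypothetical.

```python
from collections import Counter

# Hypothetical tweet records; a real analysis would load these from the Twitter API.
tweets = [
    {"user": "user_a", "retweets": 1, "likes": 2},
    {"user": "user_a", "retweets": 0, "likes": 1},
    {"user": "user_a", "retweets": 2, "likes": 0},
    {"user": "user_b", "retweets": 10, "likes": 12},
]

# Number of tweets per account.
tweet_counts = Counter(t["user"] for t in tweets)

# Total retweets and likes received per account.
retweet_totals = Counter()
like_totals = Counter()
for t in tweets:
    retweet_totals[t["user"]] += t["retweets"]
    like_totals[t["user"]] += t["likes"]

most_active = tweet_counts.most_common(1)[0][0]
most_retweeted = retweet_totals.most_common(1)[0][0]
most_liked = like_totals.most_common(1)[0][0]

print("most tweets:  ", most_active)
print("most retweets:", most_retweeted)
print("most likes:   ", most_liked)
```

As in the real data, the account that tweets the most need not be the one with the largest impact.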