Uncommon Ground


Microscale trait-environment associations in Protea

If you follow me (or Nora Mitchell) on Twitter, you saw several weeks ago that a publish-before-print version of our most recent paper appeared in the American Journal of Botany. This morning I noticed that the full published version is available on the AJB website. Here’s the citation and abstract:

Mitchell, N., and K. E. Holsinger.  2019.  Microscale trait‐environment associations in two closely‐related South African shrubs. American Journal of Botany 106:211-222.  doi: 10.1002/ajb2.1234

Premise of the Study
Plant traits are often associated with the environments in which they occur, but these associations often differ across spatial and phylogenetic scales. Here we study the relationship between microenvironment, microgeographical location, and traits within populations using co‐occurring populations of two closely related evergreen shrubs in the genus Protea.
Methods
We measured a suite of functional traits on 147 plants along a single steep mountainside where both species occur, and we used data‐loggers and soil analyses to characterize the environment at 10 microsites spanning the elevational gradient. We used Bayesian path analyses to detect trait‐environment relationships in the field for each species. We used complementary data from greenhouse grown seedlings derived from wild collected seed to determine whether associations detected in the field are the result of genetic differentiation.
Key Results
Microenvironmental variables differed substantially across our study site. We found strong evidence for six trait‐environment associations, although these differed between species. We were unable to detect similar associations in greenhouse‐grown seedlings.
Conclusions
Several leaf traits were associated with temperature and soil variation in the field, but the inability to detect these in the greenhouse suggests that differences in the field are not the result of genetic differentiation.

On the importance of making observations (and inferences) at the right hierarchical level

I mentioned a couple of weeks ago that trait-environment associations observed at a global scale across many lineages don’t necessarily correspond to those observed within lineages at a smaller scale (link). I didn’t mention it then, but this is just another example of the general phenomenon known as the ecological fallacy, in which associations evident at the level of a group are attributed to individuals within the group. The ecological fallacy is related to Simpson’s paradox, in which within-group associations differ from those between groups.
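A small numerical illustration may help here. The sketch below uses made-up numbers for three hypothetical groups (not data from any study mentioned in this post). Within every group the trait decreases as the environmental variable increases, yet the group means increase together, so the among-group association has the opposite sign from the within-group one.

```python
from statistics import mean


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical (environment, trait) measurements for three groups.
groups = {
    "A": ([1.0, 2.0, 3.0], [5.0, 4.0, 3.0]),
    "B": ([4.0, 5.0, 6.0], [8.0, 7.0, 6.0]),
    "C": ([7.0, 8.0, 9.0], [11.0, 10.0, 9.0]),
}

# Association within each group.
within_slopes = {name: slope(xs, ys) for name, (xs, ys) in groups.items()}

# Association among group means.
group_mean_x = [mean(xs) for xs, _ in groups.values()]
group_mean_y = [mean(ys) for _, ys in groups.values()]
among_slope = slope(group_mean_x, group_mean_y)

print("within-group slopes:", within_slopes)
print("slope among group means:", among_slope)
```

Anyone who looked only at the group means would infer a positive relationship that holds for no individual group.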

A recent paper in Proceedings of the National Academy of Sciences gives practical examples of why it’s important to make observations at the level you’re interested in and why you should be very careful about extrapolating associations observed at one level to associations at another. The authors report on six repeated-measure studies in which the responses of multiple participants (87-94) 1 were assessed across time. Thus, they could assess both the amount of variation within individuals over time and the amount of variation among individuals at one time. They found that the variance within individuals was between two and four times higher than the variance among individuals. Why do we care? Well, if you wanted to know, for example, whether administering imipramine reduced symptoms of clinical depression (sample 4 in the paper) and used the among-individual variance in depression measured once to assess whether or not an observed difference was statistically meaningful, you’d be using a standard error that’s too small, by a factor of roughly 1.4 to 2 if the variance ratio is two to four. As a result, you’d be more confident that a difference exists than you should be based on the amount of variation within individuals.
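The arithmetic behind the standard error is worth spelling out. A standard error scales with the square root of a variance, so a variance ratio translates into a smaller ratio of standard errors. The sketch below assumes equal sample sizes for the two variances, and the function name is just an illustrative choice.

```python
import math


def se_ratio(variance_ratio):
    """Ratio of standard errors implied by a ratio of variances.

    Assumes both standard errors are computed from the same sample size,
    so the sample size cancels and only the square root remains.
    """
    return math.sqrt(variance_ratio)


# Variance ratios of two and four, the range reported in the abstract.
ratios = {r: se_ratio(r) for r in (2.0, 4.0)}
for variance_ratio, factor in ratios.items():
    print(f"variance ratio {variance_ratio}: standard error ratio {factor:.2f}")
```

So a within-individual variance two to four times the among-individual variance implies a standard error that is too small by a factor between the square root of two and two.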

Why does this matter to an ecologist or an evolutionary biologist? Have you ever heard of “space-time substitution”? Do a Google search and near the top you’ll find a link to this chapter from Long Term Studies in Ecology by Steward Pickett. The idea is that because longitudinal studies take a very long time, we can use variation in space as a substitute for variation in time. The assumption is rarely tested (see this paper for an exception), but it is widely used. The problem is that in any spatially structured system with a finite number of populations or sites, the variance among sites at any one time (the spatial variation we’d measure) is substantially less than the variance in any one site across time (the temporal variance). If we’re interested in the spatial variance, that’s fine. If we’re interested in how variable the system is over time, though, it’s a problem. It’s also a problem if we believe that associations we see across populations at one point in time are characteristics of any one population across time.

In the context of the leaf economic spectrum, most of the global associations that have been documented involve associations between species mean trait values. Space-time substitution may not work, and this recent paper in PNAS illustrates that among-group associations in humans don’t reliably predict individual associations. For the same reason, if we want to understand the mechanistic basis of trait-environment or trait-trait associations, by which I mean the evolutionary mechanisms acting at the individual level that produce those associations within individuals, we need to measure the traits on individuals and measure the environments where those individuals occur.

Here’s the title and abstract of the paper that inspired this post. I’ve also included a link.

Lack of group-to-individual generalizability is a threat to human subjects research

Aaron J. Fisher, John D. Medaglia, and Bertus F. Jeronimus

Only for ergodic processes will inferences based on group-level data generalize to individual experience or behavior. Because human social and psychological processes typically have an individually variable and time-varying nature, they are unlikely to be ergodic. In this paper, six studies with a repeated-measure design were used for symmetric comparisons of interindividual and intraindividual variation. Our results delineate the potential scope and impact of nonergodic data in human subjects research. Analyses across six samples (with 87–94 participants and an equal number of assessments per participant) showed some degree of agreement in central tendency estimates (mean) between groups and individuals across constructs and data collection paradigms. However, the variance around the expected value was two to four times larger within individuals than within groups. This suggests that literatures in social and medical sciences may overestimate the accuracy of aggregated statistical estimates. This observation could have serious consequences for how we understand the consistency between group and individual correlations, and the generalizability of conclusions between domains. Researchers should explicitly test for equivalence of processes at the individual and group level across the social and medical sciences.

doi: 10.1073/pnas.1711978115

  1. The studies are on human subjects.

You really need to check your statistical models, not just fit them

I haven’t had a chance to read the paper I mention below yet, but it looks like a very good guide to model checking – a step that is too often forgotten. It doesn’t do us much good to estimate parameters of a statistical model that doesn’t do well at fitting the data we have. That’s what model checking is all about. In a Bayesian context, posterior predictive model checking is particularly useful.1 If the parameters and the model you used to estimate them can’t reproduce the data you collected reasonably well, the model isn’t doing a good job of fitting the data, and you shouldn’t trust the parameter estimates.

If you happen to be using Stan (via rstan) or rstanarm, posterior predictive model checking is either immediately available (rstanarm) or easy to make available (rstan) in Shinystan. It’s built on the functions in bayesplot, which provides the underlying functions for posterior prediction for virtually any package (provided you coerce the result into the right format). I’ve been using bayesplot lately, because it integrates nicely with R Notebooks, meaning that I can keep a record of my model checking in the same place that I’m developing and refining the code that I’m working on.
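For readers who aren’t using those R packages, the logic of a posterior predictive check can be sketched in plain Python. This is not how bayesplot or Shinystan do it. It is a minimal illustration under loud assumptions: the data are made up, the model is a normal distribution with its standard deviation fixed at the sample value, and the posterior for the mean is approximated as normal around the sample mean (as it would be with a flat prior). The check asks how often data replicated from the fitted model have a maximum at least as large as the observed maximum.

```python
import random
from statistics import mean, stdev

# Hypothetical right-skewed observations (made up for illustration).
observed = [0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.5, 0.6, 0.7,
            0.8, 0.9, 1.0, 1.2, 1.5, 1.9, 2.4, 3.1, 4.5, 9.0]


def posterior_predictive_p(data, statistic, n_draws=2000, seed=1):
    """Fraction of replicated data sets whose statistic is >= the observed one.

    Assumed model: y ~ Normal(mu, s), with s fixed at the sample standard
    deviation and mu ~ Normal(sample mean, s / sqrt(n)) as its posterior.
    """
    rng = random.Random(seed)
    n = len(data)
    y_bar, s = mean(data), stdev(data)
    t_obs = statistic(data)
    exceed = 0
    for _ in range(n_draws):
        mu = rng.gauss(y_bar, s / n ** 0.5)               # draw a posterior mean
        replicate = [rng.gauss(mu, s) for _ in range(n)]  # simulate a data set
        if statistic(replicate) >= t_obs:
            exceed += 1
    return exceed / n_draws


p_max = posterior_predictive_p(observed, max)
print("posterior predictive p-value for the maximum:", p_max)
```

A p-value close to 0 or 1 says the model rarely reproduces that feature of the data. Here the normal model struggles to generate a maximum as extreme as the one observed, which is a hint that a symmetric distribution is a poor choice for skewed data.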

Here’s the title, abstract, and a link:

A guide to Bayesian model checking for ecologists

Paul B. Conn, Devin S. Johnson, Perry J. Williams, Sharon R. Melin, Mevin B. Hooten

Ecological Monographs doi: 10.1002/ecm.1314

Checking that models adequately represent data is an essential component of applied statistical inference. Ecologists increasingly use hierarchical Bayesian statistical models in their research. The appeal of this modeling paradigm is undeniable, as researchers can build and fit models that embody complex ecological processes while simultaneously accounting for observation error. However, ecologists tend to be less focused on checking model assumptions and assessing potential lack of fit when applying Bayesian methods than when applying more traditional modes of inference such as maximum likelihood. There are also multiple ways of assessing the fit of Bayesian models, each of which has strengths and weaknesses. For instance, Bayesian P values are relatively easy to compute, but are well known to be conservative, producing P values biased toward 0.5. Alternatively, lesser known approaches to model checking, such as prior predictive checks, cross‐validation probability integral transforms, and pivot discrepancy measures may produce more accurate characterizations of goodness‐of‐fit but are not as well known to ecologists. In addition, a suite of visual and targeted diagnostics can be used to examine violations of different model assumptions and lack of fit at different levels of the modeling hierarchy, and to check for residual temporal or spatial autocorrelation. In this review, we synthesize existing literature to guide ecologists through the many available options for Bayesian model checking. We illustrate methods and procedures with several ecological case studies including (1) analysis of simulated spatiotemporal count data, (2) N‐mixture models for estimating abundance of sea otters from an aircraft, and (3) hidden Markov modeling to describe attendance patterns of California sea lion mothers on a rookery. 
We find that commonly used procedures based on posterior predictive P values detect extreme model inadequacy, but often do not detect more subtle cases of lack of fit. Tests based on cross‐validation and pivot discrepancy measures (including the “sampled predictive P value”) appear to be better suited to model checking and to have better overall statistical performance. We conclude that model checking is necessary to ensure that scientific inference is well founded. As an essential component of scientific discovery, it should accompany most Bayesian analyses presented in the literature.

  1. Andrew Gelman introduced the idea more than 20 years ago (link), but it’s only really caught on since his Stan group made some general purpose packages available that simplify the process of producing the predictions. (See the next paragraph for references.)

Trait-environment relationships in Pelargonium

Almost 15 years ago Wright et al. (Nature 428:821–827; 2004 – doi: 10.1038/nature02403) described the worldwide leaf economics spectrum as “a universal spectrum of leaf economics consisting of key chemical, structural and physiological properties.” Since then, an enormous number of articles have been published that examine or refer to it – more than 4000 according to Google Scholar. In the past few years, many authors have pointed out that it may not be as universal as originally presumed. For example, in Mitchell et al. (The American Naturalist 185:525-537; 2015 – http://www.jstor.org/stable/10.1086/680051) we found a negative relationship between an important component of the leaf economics spectrum (leaf mass per area) and mean annual temperature in Pelargonium from the Cape Floristic Region of southwestern South Africa, while the global pattern is for a positive relationship.1

Now Tim Moore and several of my colleagues follow up with a more detailed analysis of trait-environment relationships in Pelargonium. They demonstrate several ways in which the global pattern breaks down in South African samples of this genus. Here’s the abstract and a link to the paper.

  • Functional traits in closely related lineages are expected to vary similarly along common environmental gradients as a result of shared evolutionary and biogeographic history, or legacy effects, and as a result of biophysical tradeoffs in construction. We test these predictions in Pelargonium, a relatively recent evolutionary radiation.
  • Bayesian phylogenetic mixed effects models assessed, at the subclade level, associations between plant height, leaf area, leaf nitrogen content and leaf mass per area (LMA), and five environmental variables capturing temperature and rainfall gradients across the Greater Cape Floristic Region of South Africa. Trait–trait integration was assessed via pairwise correlations within subclades.
  • Of 20 trait–environment associations, 17 differed among subclades. Signs of regression coefficients diverged for height, leaf area and leaf nitrogen content, but not for LMA. Subclades also differed in trait–trait relationships and these differences were modulated by rainfall seasonality. Leave‐one‐out cross‐validation revealed that whether trait variation was better predicted by environmental predictors or trait–trait integration depended on the clade and trait in question.
  • Legacy signals in trait–environment and trait–trait relationships were apparently lost during the earliest diversification of Pelargonium, but then retained during subsequent subclade evolution. Overall, we demonstrate that global‐scale patterns are poor predictors of patterns of trait variation at finer geographic and taxonomic scales.


  1. If you read The American Naturalist paper, you’ll see that we wrote in the Discussion that “We could not detect a relationship between LMA and MAT in Protea….” I wouldn’t write it that way now. Look at Table 2. You’ll see that the posterior mean for the relationship is 0.135 with a 95% credible interval of (-0.078,0.340). I would now write that “We detected a weakly supported positive relationship between LMA and MAT….” Why the difference? I’ve taken to heart Andrew Gelman’s observation that “The difference between ‘significant’ and ‘not significant’ is not itself statistically significant” (blog post; article in The American Statistician). I am training myself to pay less attention to which coefficients in a regression are significant and which aren’t and more to reporting the best guess we have about each relationship (the posterior means) and the amount of confidence we have about them (the credible intervals). I recently learned about hypothesis() in brms, which will provide an estimate of the posterior probability that you’ve got the sign of the relationship right. I need to investigate that. I suspect that’s what I’ll be using in the future.
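To get a rough feel for what a posterior probability of the sign would look like for the LMA and MAT relationship, here is a back-of-the-envelope sketch. It is not what hypothesis() in brms does, since that function works from the posterior draws themselves. The sketch instead assumes the posterior is approximately normal, recovers a standard deviation from the width of the reported 95% credible interval, and computes the probability that the coefficient is positive. The function names are illustrative only.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_positive(post_mean, ci_lower, ci_upper):
    """Approximate P(coefficient > 0) from a posterior mean and 95% interval.

    Assumes a normal posterior, so the 95% interval spans about
    2 * 1.96 posterior standard deviations.
    """
    post_sd = (ci_upper - ci_lower) / (2.0 * 1.959964)
    return normal_cdf(post_mean / post_sd)


# Posterior mean and 95% credible interval quoted above from Table 2.
p_positive = prob_positive(0.135, -0.078, 0.340)
print(f"approximate posterior probability of a positive relationship: {p_positive:.2f}")
```

Under that normal approximation the probability that the relationship is positive comes out at roughly 0.9, which fits the description “weakly supported positive relationship” better than “could not detect a relationship.”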

Trait-climate evolution in Protea

Protea compacta

If you’re reading this post, you know that my colleagues and I have been studying Protea for more than a decade. A lot of our work has focused on documenting and understanding trait-environment associations. We’ve studied those associations among populations within species (Protea repens: https://doi.org/10.1093/aob/mcv146), among populations within a small, closely related clade (Protea sect. Exsertae: https://doi.org/10.1111/j.1558-5646.2010.01131.x and https://doi.org/10.1111/j.1420-9101.2012.02548.x), and across the entire genus (https://doi.org/10.1086/680051). But all of those studies look at the relationship between traits and the climate as it is now (as reflected in the South African Atlas of Agrohydrology and Climatology). They haven’t examined how traits have evolved in response to changes in climate.

Our latest paper begins to address that shortcoming. We use the highly resolved phylogeny of Protea that Nora Mitchell constructed as part of her dissertation (http://darwin.eeb.uconn.edu/uncommon-ground/blog/2017/01/23/a-new-phylogeny-for-protea/ and https://doi.org/10.3732/ajb.1600227), and we reconstruct estimates of how traits changed over evolutionary time in concert (or not) with climates. Our reconstructions depend on particular models of evolutionary change, and we explore several alternatives. Here’s the abstract:

Evolutionary radiations are responsible for much of Earth’s diversity, yet the causes of these radiations are often elusive. Determining the relative roles of adaptation and geographic isolation in diversification is vital to understanding the causes of any radiation, and whether a radiation may be labeled as “adaptive” or not. Across many groups of plants, trait–climate relationships suggest that traits are an important indicator of how plants adapt to different climates. In particular, analyses of plant functional traits in global databases suggest that there is an “economics spectrum” along which combinations of functional traits covary along a fast–slow continuum. We examine evolutionary associations among traits and between trait and climate variables on a strongly supported phylogeny in the iconic plant genus Protea to identify correlated evolution of functional traits and the climatic-niches that species occupy. Results indicate that trait diversification in Protea has climate associations along two axes of variation: correlated evolution of plant size with temperature and leaf investment with rainfall. Evidence suggests that traits and climatic-niches evolve in similar ways, although some of these associations are inconsistent with global patterns on a broader phylogenetic scale. When combined with previous experimental work suggesting that trait–climate associations are adaptive in Protea, the results presented here suggest that trait diversification in this radiation is adaptive.

Mitchell, N., J.E. Carlson, and K.E. Holsinger.  2018.  Correlated evolution between climate and suites of traits along a fast–slow continuum in the radiation of Protea. Ecology and Evolution 8:1853–1866. doi: 10.1002/ece3.3773.

Plants, People, and the Mother City

Tanisha Williams, Fulbright 2015-2016, South Africa, at Boulders Beach visiting the penguins.

Some of you know that Carl Schlichting and I co-advise Tanisha Williams. If you know that, you almost certainly know that Tanisha spent the 2015-2016 academic year as a Fulbright Fellow in South Africa. She was based at the Cape Peninsula University of Technology, and she used her time not only to collect seeds of Pelargonium and establish experimental gardens at Kirstenbosch Botanical Garden and Rhodes University but also to work with two non-profit environmental organizations. She posted an article about her experience on the blog of the Fulbright Student Program. Here’s an excerpt to whet your appetite:

Among the many experiences I had, I must say the residents from the Khayelitsha township have taken a special place in my heart. This is where I taught girls and young women math, science, computer tutoring, life skills, and female empowerment through a community center program. It was such an impactful experience, as these girls are growing up in a community with high rates of unemployment, violence, and other socioeconomic issues. It was empowering for me to see the curiosity and determination these girls had for learning and changing their community. They thought I was there to teach them from my own experiences being raised in a comparable situation and now working on my doctorate as a scientist, but I know I was the one that gained the most from our time together. I learned what it truly means to have hope and persevere. These lessons, along with the ecological and evolutionary insights from my academic research, will be ones that I always remember.

Using weather to predict growth of forest trees

Last January I mentioned that I co-authored a paper that appeared on bioRxiv in which we combined tree ring and growth increment data to predict growth from weather and biophysical data. The paper has now appeared in Ecosphere, an open-access journal from the Ecological Society of America. Here’s the abstract. You’ll find the full citation below.

Fusing tree-ring and forest inventory data to infer influences on tree growth

Better understanding and prediction of tree growth is important because of the many ecosystem services provided by forests and the uncertainty surrounding how forests will respond to anthropogenic climate change. With the ultimate goal of improving models of forest dynamics, here we construct a statistical model that combines complementary data sources, tree-ring and forest inventory data. A Bayesian hierarchical model was used to gain inference on the effects of many factors on tree growth—individual tree size, climate, biophysical conditions, stand-level competitive environment, tree-level canopy status, and forest management treatments—using both diameter at breast height (dbh) and tree-ring data. The model consists of two multiple regression models, one each for the two data sources, linked via a constant of proportionality between coefficients that are found in parallel in the two regressions. This model was applied to a data set of ~130 increment cores and ~500 repeat measurements of dbh at a single site in the Jemez Mountains of north-central New Mexico, USA. The tree-ring data serve as the only source of information on how annual growth responds to climate variation, whereas both data types inform non-climatic effects on growth. Inferences from the model included positive effects on growth of seasonal precipitation, wetness index, and height ratio, and negative effects of dbh, seasonal temperature, southerly aspect and radiation, and plot basal area. Climatic effects inferred by the model were confirmed by a dendroclimatic analysis. Combining the two data sources substantially reduced uncertainty about non-climate fixed effects on radial increments. This demonstrates that forest inventory data measured on many trees, combined with tree-ring data developed for a small number of trees, can be used to quantify and parse multiple influences on absolute tree growth. 
We highlight the kinds of research questions that can be addressed by combining the high-resolution information on climate effects contained in tree rings with the rich tree- and stand-level information found in forest inventories, including projection of tree growth under future climate scenarios, carbon accounting, and investigation of management actions aimed at increasing forest resilience.

Evans, M. E. K., D. A. Falk, A. Arizpe, T. L. Swetnam, F. Babst, and K. E. Holsinger. 2017. Fusing tree-ring and forest inventory data to infer influences on tree growth. Ecosphere 8(7):e01889. doi: 10.1002/ecs2.1889
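The phrase “linked via a constant of proportionality” in the abstract can be made concrete with a toy simulation. This is only a sketch of the general idea and not the model in the paper, which is a Bayesian hierarchical model fit jointly to both data sources. Here a single made-up predictor affects tree-ring increments with coefficient beta and dbh increments with coefficient c * beta, each regression is fit separately by ordinary least squares, and the ratio of the two slopes recovers c. All names, numbers, and the direction of the proportionality are assumptions chosen for illustration.

```python
import random
from statistics import mean


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


rng = random.Random(42)
beta = 2.0                 # hypothetical effect of the predictor on ring width
c = 0.5                    # hypothetical constant of proportionality
n_ring, n_dbh = 130, 500   # sample sizes loosely echoing the abstract

# The same kind of predictor is measured for both data sources.
x_ring = [rng.uniform(0.0, 1.0) for _ in range(n_ring)]
x_dbh = [rng.uniform(0.0, 1.0) for _ in range(n_dbh)]

# Two regressions whose coefficients differ only by the factor c.
y_ring = [beta * x + rng.gauss(0.0, 0.1) for x in x_ring]
y_dbh = [c * beta * x + rng.gauss(0.0, 0.1) for x in x_dbh]

slope_ring = ols_slope(x_ring, y_ring)
slope_dbh = ols_slope(x_dbh, y_dbh)
c_hat = slope_dbh / slope_ring

print("estimated constant of proportionality:", round(c_hat, 3))
```

The point of linking the regressions in a joint model is that the data source with many trees can help pin down coefficients shared with the data source that has few, something this separate-fits sketch does not attempt.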

The influence of climate on tree growth

Northern Hemisphere temperature changes estimated from various proxy records shown in blue (Mann et al. 1999). Instrumental data shown in red. Note the large uncertainty (grey area) as you go further back in time.

Ecologists and paleoecologists have used the width of tree rings for years as a way of inferring past climates. In fact, tree ring data were an important component of the proxy data Mann et al. (1998) used when they constructed their famous1 hockey stick representing global surface temperatures over the last millennium. I don’t have anything as earth shattering as a hockey stick to share with you, but I am pleased to report that a paper on which I am a co-author demonstrates how to combine tree ring and growth increment data (with other data) to predict growth of forest trees. Here’s the abstract and a link to the paper on bioRxiv.


Fusing tree-ring and forest inventory data to infer influences on tree growth

Better understanding and prediction of tree growth is important because of the many ecosystem services provided by forests and the uncertainty surrounding how forests will respond to anthropogenic climate change. With the ultimate goal of improving models of forest dynamics, here we construct a statistical model that combines complementary data sources: tree-ring and forest inventory data. A Bayesian hierarchical model is used to gain inference on the effects of many factors on tree growth (individual tree size, climate, biophysical conditions, stand-level competitive environment, tree-level canopy status, and forest management treatments) using both diameter at breast height (DBH) and tree-ring data. The model consists of two multiple regression models, one each for the two data sources, linked via a constant of proportionality between coefficients that are found in parallel in the two regressions. The model was applied to a dataset developed at a single, well-studied site in the Jemez Mountains of north-central New Mexico, U. S. A. Inferences from the model included positive effects of seasonal precipitation, wetness index, and height ratio, and negative effects of seasonal temperature, southerly aspect and radiation, and plot basal area. Climatic effects inferred by the model compared well to results from a dendroclimatic analysis. Combining the two data sources did not lead to higher predictive accuracy (using the leave-one-out information criterion, LOOIC), either when there was a large number of increment cores (129) or under a reduced data scenario of 15 increment cores. However, there was a clear advantage, in terms of parameter estimates, to the use of both data sources under the reduced data scenario: DBH remeasurement data for ~500 trees substantially reduced uncertainty about non-climate fixed effects on radial increments. 
We discuss the kinds of research questions that might be addressed when the high-resolution information on climate effects contained in tree rings are combined with the rich metadata on tree- and stand-level conditions found in forest inventories, including carbon accounting and projection of tree growth and forest dynamics under future climate scenarios.