Uncommon Ground

Monthly Archive: June 2018

Causal inference in ecology – Concluding thoughts

Causal inference in ecology – links to the series

Last week I concluded that the Rubin causal model isn’t likely to help me make causal inferences with the kinds of observational data I collect. I also argued that

It does, however, illuminate the ways in which additional data from different systems could be combined (informally) with the data I collect1 to make plausible causal inferences.

From the one data set I analyzed last week, I concluded that we could see an association between rainfall and stomatal density in Protea sect. Exsertae but that we couldn’t claim (on the basis of this evidence alone) that the differences in rainfall caused differences in stomatal density. Why do I claim that “additional data from different systems [can] be combined (informally) with [these] data to make plausible causal inferences”? Here’s why.

Think back to when we discussed controlled experiments. I pointed out that by randomizing individuals across treatments we statistically control for the chance that there’s some unmeasured factor that influences the results. It’s not as good as a perfectly controlled experiment in which the individuals are identical in every way except for the one factor whose causal influence we are trying to estimate, but it’s pretty good. Well, if we have a lot of observations from different systems – different taxa, different ecosystems, different climates – and we get higher stomatal densities in areas with more annual rainfall, as we did in Protea sect. Exsertae, we also know that these other systems differ from Protea sect. Exsertae in many different ways in addition to those having to do with annual rainfall. That’s not as good as randomization, but it suggests that the association we saw in that small group of plants in the Cape Floristic Region is similar to associations elsewhere. That means the association is stable across a broader range of taxa or ecosystems or climates, or all three, than our limited data showed, suggesting that there is a causal relationship.

Now it still doesn’t show that it’s mean annual rainfall, per se, that matters. It could still be something that’s associated with mean annual rainfall not only in the CFR but also in the other systems we studied. If we happened to find that the association always held, that it was never violated in any system, we still couldn’t exclude the possibility that the “true” causal factor was this other thing we aren’t measuring, but it begins to become a bit implausible – rather like claiming that it’s not smoking that causes cancer, it’s something else that’s associated with smoking that causes cancer.2
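The "same association in many different systems" argument can be made concrete with a minimal sketch. The three systems and all of their numbers below are invented for illustration; they are not measurements from Protea or from any real study. The sketch only checks whether the least-squares slope of stomatal density on rainfall has the same sign in every system, which is evidence of stability, not proof of causation.

```python
from statistics import mean

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical (made-up) data: annual rainfall and stomatal density
# for three invented systems. These are NOT real observations.
systems = {
    "system_A": ([300, 450, 600, 800], [110, 125, 131, 150]),
    "system_B": ([900, 1200, 1500, 2000], [200, 214, 240, 251]),
    "system_C": ([150, 200, 260, 400], [80, 79, 95, 104]),
}

# One slope per system, then a check that they all point the same way.
slopes = {name: ols_slope(x, y) for name, (x, y) in systems.items()}
same_sign = all(s > 0 for s in slopes.values())

for name, slope in slopes.items():
    print(f"{name}: slope = {slope:.4f}")
print("association positive in every system:", same_sign)
```

A consistent sign across systems that differ in many other respects is what makes an unmeasured confounder less plausible; it does not rule one out.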

This kind of argument doesn’t produce logical certainty, but re-read the post on falsification and you’ll see that even if a well-controlled experiment fails to give the results predicted by a hypothesis, it is very difficult to be sure that it’s the hypothesis that’s wrong. It may be that the experimental conditions don’t match those presumed by the hypothesis, in which case we can’t say anything about the truth or falsity of the hypothesis. In other words, even the classical hypothesis test can’t reject a hypothesis with certainty. There’s always judgment involved. It can’t be escaped.

Bottom line: If you’re willing to reject a hypothesis based on a failed experiment because you’re willing to examine all of the factors influencing the experimental conditions and conclude that none of them are the problem,3 you should be as willing to use evidence from a range of associational studies combined with some theory (whether a formal mathematical model or verbal description of the mechanics of a system) to build a case for a causal relationship from observational data. In neither case will you be certain of your conclusions. Your conclusions will merely be more or less plausible depending on how much and how strong your evidence is.

As scientists,4 we are more like detectives than logicians. We build cases. We don’t build syllogisms.

  1. Remember what I wrote in that last footnote.
  2. You could argue that if the two factors, the “true” causal factor and the one we measure, are invariably connected, there is really only one factor. That’s a longer philosophical discussion that I don’t have the energy to get into – at least not now.
  3. Notice that reaching this conclusion depends on your background knowledge about the system and its components, i.e., prior knowledge, not observations from the experiment itself.
  4. Or at least as ecologists and evolutionists.

Causal inference in ecology – The Rubin causal model in ecology


Evaluating the claim that viewing of the X Files caused women to have more positive beliefs about science illustrated how the Rubin causal model can be used to make causal inferences from observational data. The basic idea is that you make the observational sample similar to a randomized experiment by using statistical adjustments to make the “treatment” and “control” conditions as similar as possible – except for the “treatment” difference.1 Several weeks ago, I promised to describe how we might use the Rubin causal model in ecology, drawing on data from a paper in PLoS One that I’m reasonably happy with. After playing with that data a bit, I changed gears. I’m going to use data from a more recent paper instead (Carlson et al., Annals of Botany 117:195-207; 2016; doi: https://dx.doi.org/10.1093/aob/mcv146).

I’ll focus on a subset of the data that explores the relationship between stomatal density of Protea repens seedlings grown in an experimental garden at Kirstenbosch National Botanical Garden and three principal components associated with the environment in the populations from which seed was collected. You’ll find the details of the analysis, an R notebook, and the data on GitHub. The HTML produced by the R notebook showing the results is at http://darwin.eeb.uconn.edu/pages/Protea-causal-analysis.nb.html. To run the analyses from the code you can download there, you’ll need to retrieve the CSV from GitHub: https://github.com/kholsinger/Protea-causal-analysis/blob/master/traits-environment-pca.csv.

Here’s the bottom line. If we run a simple regression (treating year of observation as a random effect), we get the following results for the regression coefficients:

| Principal component | Mean | 2.5%tile | 97.5%tile |
|---|---|---|---|
| PCA 1 (annual temperature) | 2.422 | 1.597 | 3.216 |
| PCA 2 (summer rainfall) | -2.125 | -2.980 | -1.277 |
| PCA 3 (annual rainfall) | 1.317 | 0.538 | 2.099 |

All three principal components are strongly associated with stomatal density. We’ve all been told repeatedly that “correlation does not equal causation,” but it’s still very tempting to conclude that warmer climates favor higher stomatal densities (PCA 1), more summer rainfall favors lower stomatal densities (PCA 2), and more annual rainfall favors higher stomatal densities (PCA 3). Given what I wrote last week about the Rubin causal model, we might even feel justified in reaching this conclusion, since we’ve statistically controlled for relevant differences among populations (other than those that we measured). But go back and read that post again, and pay particular attention to this sentence:

The degree to which you can be confident in your causal inference depends (a) on how well you’ve done at identifying and measuring plausible causal factors and (b) how closely your two groups are matched on those other causal factors.

Notice (a) in particular. We have good evidence for the associations noted above,2 but the principal components we identified were based on only seven environmental descriptors: six from the South African Atlas of Agrohydrology and Climatology, plus elevation (from a NASA digital elevation model). There could easily be other environmental factors correlated with one (or all) of the principal components we identified that drive the association we observe. Now if similar associations had been observed in worldwide datasets involving many different groups of plants, it might not be unreasonable to conclude that there is a causal relationship between the principal components we analyzed and stomatal density, but that conclusion wouldn’t be based solely on the data and analysis here. It would depend on seeing the same pattern repeatedly in different contexts, which gives us something analogous to haphazard (not random) assignment to experimental conditions.

There is, however, a further caveat.

In Carlson et al., we obtained the following results for the mean and 95% credible interval on the association between stomatal density and each of the three principal component axes:

| Principal component | Mean | 2.5%tile | 97.5%tile |
|---|---|---|---|
| PCA 1 (annual temperature) | 0.258 | 0.077 | 0.441 |
| PCA 2 (summer rainfall) | -0.216 | -0.394 | -0.040 |
| PCA 3 (annual rainfall) | 0.155 | -0.043 | 0.349 |
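For readers who want to compare the two sets of coefficients side by side, here is a minimal sketch that flags which 95% intervals exclude zero. The numbers are copied from the two tables in this post; the function names are made up for illustration. A yes/no flag like this is only the conventional reading of an interval and throws away the information about how far from zero the interval sits.

```python
# (mean, 2.5%tile, 97.5%tile), copied from the two tables in this post.
simple_regression = {
    "PCA 1 (annual temperature)": (2.422, 1.597, 3.216),
    "PCA 2 (summer rainfall)": (-2.125, -2.980, -1.277),
    "PCA 3 (annual rainfall)": (1.317, 0.538, 2.099),
}
carlson_et_al = {
    "PCA 1 (annual temperature)": (0.258, 0.077, 0.441),
    "PCA 2 (summer rainfall)": (-0.216, -0.394, -0.040),
    "PCA 3 (annual rainfall)": (0.155, -0.043, 0.349),
}

def excludes_zero(lower, upper):
    """True if the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0

def summarize(table):
    """Map each principal component to whether its interval excludes zero."""
    return {name: excludes_zero(lo, hi) for name, (_, lo, hi) in table.items()}

for label, table in [("simple regression", simple_regression),
                     ("Carlson et al.", carlson_et_al)]:
    for name, flag in summarize(table).items():
        print(f"{label:18s} {name:28s} excludes zero: {flag}")
```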

Don’t worry about the difference in magnitude of the coefficients. In Carlson et al. we transformed the response variables to a mean of 0 and a standard deviation of 1 before the analysis. Focus on the credible intervals. Here the credible interval for PCA 3 overlaps zero. In a conventional interpretation, we’d say that we don’t have evidence for a relationship between annual rainfall and stomatal density.3 I’d prefer to say that the relationship with annual rainfall appears to be positive, but the evidence is weaker than for the relationships with annual temperature or summer rainfall. However you say it, though, there seems to be a difference in the results. Why would that be?

Because in Carlson et al. we analyzed stomatal density as one of a suite of leaf traits (length-width ratio, stomatal density, stomatal pore index, specific leaf area, and leaf area) that are correlated with one another. In particular, leaf area and stomatal density are associated with one another, perhaps because of the way that leaves develop. Leaf area is associated with annual rainfall. Thus, when stomatal density is analyzed on its own, as in the simple regression above, the association between leaf area and stomatal density intensifies the observed relationship between annual rainfall and stomatal density.
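A small simulation illustrates how a correlated trait can intensify an observed association. Everything here is an assumption made for illustration: the causal structure (rainfall affects leaf area, and leaf area is tied to stomatal density), the effect sizes, and the sample size are all invented. It is also not the multivariate model used in Carlson et al.; it simply compares the slope of density on rainfall with and without holding leaf area fixed.

```python
import random
from statistics import mean

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

def residuals(x, y):
    """Residuals of y after a simple least-squares regression on x."""
    b = ols_slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 2000
rain = [rng.gauss(0, 1) for _ in range(n)]
# Assumed structure with invented effect sizes: rainfall increases leaf
# area, and leaf area is tied to stomatal density through development.
leaf_area = [0.8 * r + rng.gauss(0, 1) for r in rain]
density = [0.2 * r + 0.5 * la + rng.gauss(0, 1)
           for r, la in zip(rain, leaf_area)]

# Slope of density on rainfall, ignoring leaf area.
marginal = ols_slope(rain, density)
# Slope of density on rainfall with leaf area held fixed, obtained by
# regressing residuals on residuals (the Frisch-Waugh-Lovell approach).
partial = ols_slope(residuals(leaf_area, rain), residuals(leaf_area, density))

print(f"ignoring leaf area:      {marginal:.3f}")
print(f"holding leaf area fixed: {partial:.3f}")
```

Under these assumptions the slope that ignores leaf area is noticeably larger than the one that holds it fixed, because the first also picks up the path that runs through leaf area.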

In short, we should modify that sentence from last week to add a condition (c):

The degree to which you can be confident in your causal inference depends (a) on how well you’ve done at identifying and measuring plausible causal factors, (b) how closely your two groups are matched on those other causal factors, and (c) whether or not your response variable is associated with something else (measured or not) that is influenced by the causal factors you’re studying.

Bottom line: For the types of observations I make4 the Rubin causal model doesn’t seem likely to help me make causal inferences. It does, however, illuminate the ways in which additional data from different systems could be combined (informally) with the data I collect5 to make plausible causal inferences. At least they should be plausible enough to motivate careful experimental or observational tests of those inferences (if the causal processes are interesting enough to warrant those tests).

  1. Implementing this approach in analysis of a real data set can become very complicated. There’s a large literature on the Rubin causal model in social science. I’ve read almost none of it. What I’ve learned about the Rubin causal model comes from reading Gelman and Hill’s regression modeling book and from reading Imbens and Rubin.
  2. That’s overstating it a bit. See the discussion that follows this paragraph.
  3. There are serious problems with this kind of interpretation. See Andrew Gelman’s post explaining why “the difference between ‘significant’ and ‘not significant’ is not itself statistically significant.”
  4. Remember, when I write “I make” I really mean “my students, postdocs, and collaborators make.” I just follow along and help with the statistics.
  5. Remember what I wrote in that last footnote.

Causal inference in ecology – The Rubin causal model (part 2)


Last week I described a straightforward example of why inferring a causal relationship from an observed association can be problematic. The authors of the study on the “Scully effect” are mostly pretty careful to write things like “regular viewers of The X-Files have far more positive beliefs about STEM than other women in the sample” rather than claiming that viewing of the X Files caused women to have more positive beliefs about STEM. In the end, though, they can’t help themselves:

The findings of this study confirm what previous research has established, that entertainment media is influential in shaping life choices.

As I pointed out last time, in order to make that claim from these data, we’d need to know that there wasn’t already a difference between women in the sample that caused women with positive beliefs about STEM to watch the X Files more often than other women.

So let’s suppose that in addition to asking women in their sample (a) whether they had watched the X Files and (b) whether they had positive beliefs about STEM, they had also asked them (c) how many courses in science and math they took during junior high and high school. Then a statistical model describing the data they collected would look like this:

\(y_i = \alpha_{\mathrm{treat}[i]} + \beta x_i\)

where \(y_i\) is a measure of positive belief for individual \(i\),1 \(\alpha_{\mathrm{treat}[i]}\) is the intercept for the group to which individual \(i\) belongs, indicating whether or not the individual was part of the treatment (watching the X Files),2 \(\beta\) is a regression coefficient indicating the amount that taking one science or math course affects the measure of positive belief, and \(x_i\) is the number of science or math courses that individual \(i\) took. If \(\alpha_t > \alpha_c\), then we have some evidence that watching the X Files causally contributes to more positive impressions of STEM in women.3
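To make the model concrete, here is a minimal sketch that fits it by least squares to simulated data. The data are entirely made up (the "true" intercepts 3.0 and 3.8, the slope 0.5, the noise level, and the sample sizes are arbitrary assumptions), and `fit_common_slope` is a hypothetical helper written for this example, not code from the study.

```python
import random
from statistics import mean

def fit_common_slope(groups):
    """Least-squares fit of y = alpha[group] + beta * x.

    `groups` maps a group label to a pair of lists (x, y). Returns
    (alphas, beta), where `alphas` maps each label to its intercept.
    """
    sxy = sxx = 0.0
    for x, y in groups.values():
        mx, my = mean(x), mean(y)
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx += sum((a - mx) ** 2 for a in x)
    beta = sxy / sxx  # pooled within-group slope
    alphas = {g: mean(y) - beta * mean(x) for g, (x, y) in groups.items()}
    return alphas, beta

rng = random.Random(7)  # fixed seed so the sketch is repeatable

def simulate(alpha, n=200, beta=0.5):
    """Made-up data: x = number of courses, y = belief score."""
    x = [rng.randint(0, 8) for _ in range(n)]
    y = [alpha + beta * xi + rng.gauss(0, 1) for xi in x]
    return x, y

groups = {"control": simulate(3.0), "treatment": simulate(3.8)}
alphas, beta = fit_common_slope(groups)
effect = alphas["treatment"] - alphas["control"]  # alpha_t - alpha_c

print(f"beta (per course)   = {beta:.3f}")
print(f"alpha_t - alpha_c   = {effect:.3f}")
```

The difference between the two fitted intercepts is the quantity the text compares: the gap between groups after adjusting for the number of courses taken.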

This approach only works, though, if the range in number of science courses taken by the two groups of women is roughly the same. If all of the women who watched the X Files took more science courses than any of the women who didn’t, we couldn’t tell whether the difference in their positive impressions was due to watching the X Files or to taking more science courses (or to the personality traits that caused them to take more science courses).
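A crude way to check this condition is to compare the range of the covariate in the two groups. The sketch below does that for two invented sets of course counts; the data and the helper names are assumptions made for illustration. Comparing ranges is a deliberately simple stand-in for a more careful look at how the two distributions overlap.

```python
def overlap_range(x_control, x_treatment):
    """Return the (low, high) covariate range shared by both groups, or None."""
    low = max(min(x_control), min(x_treatment))
    high = min(max(x_control), max(x_treatment))
    return (low, high) if low <= high else None

def share_in_range(x, bounds):
    """Fraction of values in x that fall inside bounds (inclusive)."""
    low, high = bounds
    return sum(low <= v <= high for v in x) / len(x)

# Made-up course counts. In the first pair the groups overlap broadly;
# in the second, every viewer took more courses than any non-viewer.
overlapping_c, overlapping_t = [0, 1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7]
separated_c, separated_t = [0, 1, 2, 3], [5, 6, 7, 8]

shared = overlap_range(overlapping_c, overlapping_t)
print("shared range:", shared)
print("share of control in shared range:", share_in_range(overlapping_c, shared))
print("separated groups:", overlap_range(separated_c, separated_t))
```

When the shared range is empty, as in the second pair, no amount of regression adjustment can separate the effect of the treatment from the effect of the covariate.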

That’s the basic idea behind the Rubin causal model: Identify all of the factors that might reasonably influence the outcome of interest, include those factors in an analysis of covariance (or something similar), and infer a causal effect of the difference between two groups if there’s an effect of the grouping variable after controlling for all of the other factors and if the groups broadly overlap on other potential causal factors. The degree to which you can be confident in your causal inference depends (a) on how well you’ve done at identifying and measuring plausible causal factors and (b) how closely your two groups are matched on those other causal factors. Matching here plays the same conceptual role as randomization in a controlled experiment.

  1. Where I assume that larger values correspond to more positive beliefs.
  2. Notice that the subscript on \(\alpha\) will only take two values. I’ll denote them \(\alpha_c\) and \(\alpha_t\) for “control” and “treatment”, respectively.
  3. Provided we’re willing to extrapolate from our sample to women in general, or at least to women in the US.