Causal inference in ecology – The Rubin causal model (part 1)
Last week I described an experiment that was reasonably well controlled. We[1] randomized genotypes within populations across two experimental gardens to determine whether certain leaf traits changed with the age of the plant, the garden in which they were grown, or both. Interpreting the results of those experiments is reasonably straightforward. At the end, I was setting up for an exploration of what might be required to infer a causal influence of home environment on plant traits from an observed statistical association between the environment in the sites from which seeds were collected and the traits of the plants grown from those seeds in the gardens.
I’m going to shift gears, because I ran across a (non-ecological) example that should be easier to understand. Putting off discussion of Protea for another week also gives me more time to get a simplified version of the data prepared and analyzed (in an R notebook). So I’m going to examine a recent report that claims evidence for the “Scully effect”, which is the claim that young women who watched Dr. Dana Scully in the X Files have a more positive impression of STEM fields, were more likely to pursue a career in STEM, and are more likely to see Scully as a role model than those who didn’t watch the X Files.[2]
I’ll focus on only the first claim. The report[3] states it more precisely this way:
Women who are medium/heavy watchers of The X-Files hold more positive views of STEM than non/light watchers, and several survey questions link this directly to the influence of Scully’s character.
And I’ll focus on only one of the pieces of evidence in the report, namely
A greater percentage of medium/heavy viewers of The X-Files strongly believe that young women should be encouraged to study STEM than non/light viewers (56% compared to 47%).
Let’s not worry about statistics. What we have is a positive association between watching the X Files and encouraging young women to study STEM. Let’s assume that the reported difference in the sample is a reliable indication of a similar difference in the population as a whole.[4] Can we conclude that watching the X Files caused that difference?
The easy answer is “No”, since we all know that correlation does not imply causation, but let’s unpack that “No” and see why correlation does not imply causation. Maybe, just maybe, there’s a way we can infer causation from correlation, provided we make some additional assumptions.
One of the first observations I made in this series is that causes precede effects. That condition is clearly satisfied here. The women included in the survey were chosen specifically to be old enough (a) to have watched the original X Files or the current seasons and (b) “to have entered the post-college workforce.” But temporal order alone doesn’t establish causation.
One reason we might be skeptical of the claim “Having watched Scully increases the probability that women believe young women should be encouraged to study STEM” is that women who watched Scully may differ from women who didn’t watch Scully in ways that would have predisposed them to encourage young women to study STEM. For example, it’s reasonable to think that women who watched Scully might have already had a greater interest in science than those who didn’t, since the X Files was a science fiction drama series, not a crime fiction or other drama series.
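To make that worry concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption for illustration: the variable names, the 50% share of women with a prior interest in science, and the probabilities of watching and of encouraging STEM are all invented, and none of them comes from the report. Watching has no causal effect at all in this toy world, yet prior interest in science by itself produces a gap between watchers and non-watchers (in expectation, about 57% versus 49%).

```python
import random

# All probabilities below are invented for illustration; none come from the report.
P_INTEREST = 0.5                          # share with a prior interest in science
P_WATCH = {True: 0.6, False: 0.3}         # P(watched | prior interest)
P_ENCOURAGE = {True: 0.65, False: 0.40}   # P(strongly encourages STEM | prior interest)


def simulate(n=100_000, seed=1):
    """Simulate n women; return a list of (interest, watched, encourages) tuples."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        interest = rng.random() < P_INTEREST
        watched = rng.random() < P_WATCH[interest]
        # Watching does not enter here: its causal effect is zero by construction.
        encourages = rng.random() < P_ENCOURAGE[interest]
        people.append((interest, watched, encourages))
    return people


def encourage_rate(people, watched, interest=None):
    """Share who encourage STEM among women with the given viewing status.

    If interest is given, restrict the comparison to that prior-interest group.
    """
    group = [e for (i, w, e) in people
             if w == watched and (interest is None or i == interest)]
    return sum(group) / len(group)


people = simulate()
# Naive comparison: watchers vs. non-watchers, ignoring prior interest.
print("all women:", encourage_rate(people, True), encourage_rate(people, False))
# Comparison within groups that share the same prior interest.
for interest in (True, False):
    print("prior interest =", interest,
          encourage_rate(people, True, interest),
          encourage_rate(people, False, interest))
```

The naive comparison shows a clear difference between watchers and non-watchers, while the comparisons within each prior-interest group show essentially none. In this sketch the whole association comes from who chose to watch, not from anything watching did.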
So to conclude that a positive association between watching Scully and encouraging young women to study STEM is evidence that watching Scully causes viewers to encourage young women to study STEM, we need to know that watching Scully or not is the only relevant difference between the women included in the observational sample. If it is, then even though we didn’t design the experiment and randomly assign women to one treatment or the other, it is as if we did. This is where Rubin’s causal model comes in.[5] It helps us think about how we might be able to determine that the watching and non-watching groups are equivalent (or how we might make them essentially equivalent if they aren’t already). That’s where we’ll pick up next week.[6]
[1] Remember the disclaimer about “we”. “We” really means Jane Carlson, Ann Marie Gawel, and Rachel Prunier. My role was primarily to sit on the sidelines and provide a little advice. I recorded some of the data when it was dictated to me, but Jane, Ann Marie, and Rachel did all of the real work.
[2] In the interest of full disclosure, I should mention that I watched only a few episodes of the X Files. That won’t surprise anyone who knows me, because anyone who knows me knows that I watch very little TV.
[3] From the Geena Davis Institute on Gender in Media.
[4] One challenge that is too often overlooked is determining what “the population as a whole” means. One of the most fundamental questions to answer in interpreting any experimental or observational result is “Over what population can the pattern I observed be generalized?” As a point of information, the authors of the report note that “All differences reported here are statistically significant at the .10 level.”
[5] Or at least this is where it comes in for me.
[6] Which also means that it will be at least two weeks before we get back to Protea.