Uncommon Ground

Statistics

Causal inference in ecology – The Rubin causal model (part 2)

Causal inference in ecology – links to the series

Last week I described a straightforward example of why inferring a causal relationship from an observed association can be problematic. The authors of the study on the “Scully effect” are mostly pretty careful to write things like “regular viewers of The X-Files have far more positive beliefs about STEM than other women in the sample” rather than claiming that viewing of the X Files caused women to have more positive beliefs about STEM. In the end, though, they can’t help themselves:

The findings of this study confirm what previous research has established, that entertainment media is influential in shaping life choices.

As I pointed out last time, in order to make that claim from these data, we’d need to know that there wasn’t already a difference between women in the sample that caused women with positive beliefs about STEM to watch the X Files more often than other women.

So let’s suppose that in addition to asking women in their sample (a) whether they had watched the X Files and (b) whether they had positive beliefs about STEM, they had also asked them (c) how many courses in science and math they took during junior high and high school. Then a statistical model describing the data they collected would look like this:

\(y_i = \alpha_{treat[i]} + \beta x_i\)

where \(y_i\) is a measure of positive belief for individual \(i\),1 \(\alpha_{treat[i]}\) is an intercept whose subscript indicates whether or not the individual was part of the treatment (watching the X Files),2 \(\beta\) is a regression coefficient indicating the amount that taking one science or math course affects the measure of positive belief, and \(x_i\) is the number of science or math courses that individual \(i\) took. If \(\alpha_t > \alpha_c\), then we have some evidence that watching the X Files causally contributes to more positive impressions of STEM in women.3
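To make the model concrete, here is a minimal sketch in Python of how the two intercepts and the shared slope could be estimated by ordinary least squares. The data are made up and noise-free, and the function name `fit_common_slope` is hypothetical; nothing here comes from the report itself.

```python
def mean(values):
    return sum(values) / len(values)


def fit_common_slope(groups):
    """Least-squares fit of y = alpha_group + beta * x.

    groups maps a label (e.g. "c" or "t") to a list of (x, y) pairs.
    Returns (alphas, beta): one intercept per group and a shared slope.
    """
    sxy = 0.0  # pooled within-group cross-products of x and y
    sxx = 0.0  # pooled within-group sum of squares of x
    centers = {}
    for label, pairs in groups.items():
        x_bar = mean([x for x, _ in pairs])
        y_bar = mean([y for _, y in pairs])
        centers[label] = (x_bar, y_bar)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    beta = sxy / sxx
    alphas = {
        label: y_bar - beta * x_bar
        for label, (x_bar, y_bar) in centers.items()
    }
    return alphas, beta


# Hypothetical data: x = number of courses taken, y = belief score.
data = {
    "c": [(1, 2.5), (2, 3.0), (3, 3.5), (4, 4.0)],  # built with alpha_c = 2
    "t": [(2, 4.0), (3, 4.5), (4, 5.0), (5, 5.5)],  # built with alpha_t = 3
}
alphas, beta = fit_common_slope(data)
print(alphas, beta)
```

Because the made-up data were built with a slope of 0.5 and intercepts of 2 and 3, the fit recovers those values, and the comparison of interest is simply `alphas["t"] - alphas["c"]`.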

This approach only works, though, if the range in number of science courses taken by the two groups of women is roughly the same. If all of the women who watched the X Files took more science courses than any of the women who didn’t, we couldn’t tell whether the difference in their positive impressions was due to watching the X Files or to taking more science courses (or to the personality traits that caused them to take more science courses).
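A quick way to check that condition is to look at the range of course counts the two groups share. The sketch below is only an illustration, with hypothetical function names and made-up course counts.

```python
def common_support(x_control, x_treated):
    """Return the (low, high) range of x shared by both groups, or None."""
    low = max(min(x_control), min(x_treated))
    high = min(max(x_control), max(x_treated))
    return (low, high) if low <= high else None


def fraction_in_range(values, support):
    """Fraction of observations that fall inside the shared range."""
    if support is None:
        return 0.0
    low, high = support
    return sum(low <= v <= high for v in values) / len(values)


# Hypothetical course counts for non-watchers and watchers.
overlapping = common_support([2, 3, 4, 5, 6], [3, 4, 5, 6, 7])
separated = common_support([1, 2, 3], [6, 7, 8])
print(overlapping, separated)
```

When the second case arises (no shared range at all), the comparison of intercepts rests entirely on extrapolating the regression line, which is exactly the situation described above.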

That’s the basic idea behind the Rubin causal model: Identify all of the factors that might reasonably influence the outcome of interest, include those factors in an analysis of covariance (or something similar), and infer a causal effect of the difference between two groups if there’s an effect of the grouping variable after controlling for all of the other factors and if the groups broadly overlap on other potential causal factors. The degree to which you can be confident in your causal inference depends (a) on how well you’ve done at identifying and measuring plausible causal factors and (b) on how closely your two groups are matched on those other causal factors. Matching here plays the same conceptual role as randomization in a controlled experiment.

  1. Where I assume that larger values correspond to more positive beliefs.
  2. Notice that the subscript on \(\alpha\) will only take two values. I’ll denote them \(\alpha_c\) and \(\alpha_t\) for “control” and “treatment”, respectively.
  3. Provided we’re willing to extrapolate from our sample to women in general, or at least to women in the US.

Causal inference in ecology – The Rubin causal model (part 1)

Last week I described an experiment that was reasonably well controlled. We1 randomized genotypes within populations across two experimental gardens to determine whether certain leaf traits changed with the age of the plant, the garden in which they were grown or both. Interpreting the results of those experiments is reasonably straightforward. At the end, I was setting up for an exploration of what might be required to infer a causal influence of home environment on plant traits from an observed statistical association between the environment in the sites from which seeds were collected and the traits of the plants grown from those seeds in the gardens.

I’m going to shift gears, because I ran across a (non-ecological) example that should be easier to understand. Putting off discussion of Protea for another week also gives me more time to get a simplified version of the data prepared and analyzed (in an R notebook). So I’m going to examine a recent report that claims evidence for the “Scully effect”, which is the claim that young women who watched Dr. Dana Scully in the X Files have a more positive impression of STEM fields, were more likely to pursue a career in STEM, and are more likely to see Scully as a role model than those who didn’t watch the X Files.2

I’ll focus on only the first claim. The report3 states it more precisely this way:

Women who are medium/heavy watchers of The X-Files hold more positive views of STEM than non/light watchers, and several survey questions link this directly to the influence of Scully’s character.

And I’ll focus on only one of the pieces of evidence in the report, namely

A greater percentage of medium/heavy viewers of The X-Files strongly believe that young women should be encouraged to study STEM than non/light viewers (56% compared to 47%).

Let’s not worry about statistics. What we have is a positive association between watching the X Files and encouraging young women to study STEM. Let’s assume that the reported difference in the sample is a reliable indication of a similar difference in the population as a whole.4 Can we conclude that watching the X Files caused that association?

The easy answer is “No”, since we all know that correlation does not imply causation, but let’s unpack that “No” and see why correlation does not imply causation. Maybe, just maybe, there’s a way we can infer causation from correlation, provided we make some additional assumptions.

One of the first observations I made in this series is that causes precede effects. That condition is clearly satisfied. The women included in the survey were chosen specifically to be old enough (a) to have watched the original X Files or the current seasons and (b) “to have entered the post-college workforce.” One reason we might be skeptical of the claim “Having watched Scully increases the probability that women believe young women should be encouraged to study STEM” is that women who watched Scully may differ from women who didn’t watch Scully in ways that would have predisposed them to encourage young women to study STEM. For example, it’s reasonable to think that women who watched Scully might have already had a greater interest in science than those who didn’t, since the X Files was a science fiction drama series, not a crime fiction or other drama series.

So to conclude that a positive association between watching Scully and encouraging young women to study STEM is evidence that watching Scully causes viewers to encourage young women to study STEM, we need to know that watching Scully or not is the only relevant difference between the women included in the observational sample. If it is, then even though we didn’t design the experiment and randomly assign women to one treatment or the other, it is as if we did. This is where Rubin’s causal model comes in.5 It helps us think about how we might be able to determine that the watching and non-watching groups are equivalent (or how we might make them essentially equivalent if they aren’t already). That’s where we’ll pick up next week.6

  1. Remember the disclaimer about “we”. “We” really means Jane Carlson, Ann Marie Gawel, and Rachel Prunier. My role was primarily to sit on the sidelines and provide a little advice. I recorded some of the data when it was dictated to me, but Jane, Ann Marie, and Rachel did all of the real work.
  2. In the interest of full disclosure, I should mention that I watched only a few episodes of the X Files. That won’t surprise anyone who knows me, because anyone who knows me knows that I watch very little TV.
  3. From the Geena Davis Institute on Gender in Media.
  4. One challenge that is too often overlooked is determining what “the population as a whole” means. One of the most fundamental questions to answer in interpreting any experimental or observational result is “Over what population can the pattern I observed be generalized?” As a point of information, the authors of the report note that “All differences reported here are statistically significant at the .10 level.”
  5. Or at least this is where it comes in for me.
  6. Which also means that it will be at least two weeks before we get back to Protea.

Why didn’t I investigate R Notebooks sooner?

I don’t remember when I first heard about R Notebooks, but it was quite a while ago. I finally decided to investigate them last weekend, and I’m hooked. I expect to be doing much of my R work in R notebooks from now on. For purposes of reproducibility, I still plan to extract R code from what are referred to as “chunks” to produce standalone R scripts that I can rerun from the R console to verify my results, but the interactive notebooks will allow me to run chunks as I’m developing new code and to document what I’m doing.

The link above will take you to documentation on R Notebooks. The only downside to them is that to use them you have to use RStudio. That’s not a big downside, since there is a free version available, but I’m still enough of an old fogey that my fingers are used to Emacs, and Emacs is where I generally prefer to edit code. It will take me a while to develop a new workflow, but you can be sure that it will include R Notebooks.

I produced a Randomization demo very quickly once I updated my version of RStudio. I produced HTML simply by saving my notebook to disk. The .nb.html file was automatically produced in the same directory. All I had to do was to upload it to an appropriate directory here. If you have a recent version of RStudio that supports R Notebooks, you can download the Markdown code using the “Code” dropdown at the top right of the page. Simply open the .Rmd file, and RStudio will open it as a notebook. You can then execute chunks yourself to redo the simulation in any way that you care to. That’s even easier than copying and pasting the code I posted earlier.

Causal inference in ecology – Setting the stage for the Rubin causal model

If you’ve been following along, you realize by now that it’s not easy to infer the cause of a phenomenon, even in a well-controlled experiment. What about observational experiments, which are what many ecologists and evolutionary biologists have? Take one paper of mine that I’m reasonably happy with (PLoS One e52035; 2012. doi: 10.1371/journal.pone.0052035). One part of that paper included an experiment of sorts. We1 established experimental gardens at the Kirstenbosch National Botanical Garden and at a mid-elevation site on Jonaskop mountain about 100 km due east of Kirstenbosch. Among other things, we were interested in how certain traits in newly formed leaves (specific leaf area, stomatal pore index, and leaf area) differed depending both on the age of the plant and on the garden in which it was grown.

This part of the paper is analogous to the thought experiment on corn that we’ve discussed so far. Since the plants were grown from wild-collected seed, we obviously couldn’t duplicate genotypes across gardens. We also didn’t have enough seed from individual maternal plants to replicate families across the gardens. So we did the best we could. We randomized seedlings within populations and split populations across gardens. You’ll see the results from this part of the analysis in the figure below.

Figure 4 from PLoS One e52035; 2012. doi: 10.1371/journal.pone.0052035

The trends with plant age are clear.2 Specific leaf area (SLA) declines with plant age, stomatal pore index increases with plant age, and leaf area increases with plant age. Given that the trends are consistent across species and gardens, I’m reasonably confident that plant age influences these traits in this group of Protea.3 Notice that I wrote “influences”, which is a short way (for me) of writing that plant age is a causal factor that influences the traits but that I am not claiming that it is the causal factor.4

Figure 3 from PLoS One e52035; 2012. doi: 10.1371/journal.pone.0052035

Similarly, the figure above makes it clear that which garden the plants are grown in influences these (and other) traits. These results won’t surprise anyone who’s worked with plants. The traits plants have depend both on how old the plant is and on where it’s grown. So far a reasonably straightforward experiment, but how about this? We also wanted to know whether the amount of change in leaf traits depended on measures of resource availability and rainfall seasonality in the places that seeds were collected from. Here we’re asking a more complicated question.

Take SLA, for example, and imagine that we’re asking the question just about changes in SLA for plants grown at Jonaskop. Now the fully fleshed out question is something like this:

  • I know that plants are often adapted to the local circumstances in which they are growing.
  • If SLA reflects plant characteristics that are important in local adaptation to nutrients or water availability, then plants that grow in places that differ in nutrients or water availability should also differ in SLA in ways that make them well-suited to the place where they occur.
  • Do we have evidence that there is an association between changes in SLA and nutrient availability or precipitation patterns in the site from which they are derived?

It’s that middle step that’s tricky. We don’t need to do anything special to run a regression on changes in SLA and home site characteristics, but to interpret that regression as evidence for the causal story in that middle step we need to do something more. Unlike the nicely randomized experiment with which we began this post, we aren’t randomizing plants across sites and allowing them to adapt. What we have is purely observational data to address this question. To what extent can we make a causal inference from these data? That’s the question I’ll turn to in the next installment.

  1. By “we” I should be clear that Jane Carlson, Rachel Prunier, and Ann Marie Gawel collected all of the seeds that “we” used to establish the gardens, and they did all of the work of germinating seedlings and establishing the gardens. They also collected nearly all of the data. I helped collect a little, but my help mostly consisted of standing there with a clipboard and data sheet and writing down numbers in the appropriate columns.
  2. The same plants were measured in 2009 and 2010. In both cases, measurements were made on newly formed, but fully expanded leaves. I’m not reporting P-values, but you can find them in Table 1.
  3. The species are all members of a small, recently evolved monophyletic clade, Protea sect. Exsertae.
  4. Notice also that I am discounting the possibility that it is weather in the year the plants were growing that influences their traits rather than their age.

Causal inference in ecology – The challenge of falsification

It sounds so simple. You have a hypothesis. You design an experiment to test it. If the predicted result doesn’t happen, reject the hypothesis and start over. That’s how science works, right? We can’t prove hypotheses, but we can reject them. That’s how we make progress. That’s what makes science empirical. End of story, right? Would I be asking that question if it were?

Let’s look at the logic a bit more carefully.

The hypothesis we’ve been using as an example is simple: If we apply nitrogen fertilizer, the yield of corn will increase. Our experiment is to till the soil in a field thoroughly, plant genetically uniform1 corn, and apply fertilizer on one part of the field and not the other. The test of our hypothesis is whether yield in the fertilized part of the field exceeds yield in the unfertilized part of the field. For the sake of argument, let’s suppose that the fertilized part of the field has the same yield as (or less than) the unfertilized part of the field. Would you conclude that adding nitrogen fertilizer doesn’t increase corn yield? I wouldn’t, and I’ll bet you wouldn’t either. Why wouldn’t we conclude that? My logic would run like this:

  • I’m aware of a lot of other experiments, including some I’ve run myself, where adding nitrogen fertilizer to corn (and to other plants for that matter) increases yield.2 There must have been something wrong with the experimental conditions.
  • The experimental conditions include everything about the experiment.
    • It could be that I didn’t do a good job of tilling the field and mixing the soil. Maybe the part of the field that I left unfertilized happened to have much higher soil fertility, more than enough to compensate for the added nitrogen in the part of the field with lower fertility. Maybe the part of the field I fertilized happened to have minerals in the soil that immediately bound the nitrogen so that it wasn’t available to the plants.
    • It could be that there was something wrong with the fertilizer. Maybe it was a bad batch and for some reason the nitrogen wasn’t in a form that’s available for plants.
    • Maybe I didn’t do a good job of randomizing the genetic background, and I happened to have families of low-yield plants in the nitrogen fertilizer treatment.
    • Maybe I put on so much nitrogen that I “burned” the corn.

The bottom line is, there are a lot of ways that the experiment could have gone wrong. When an experiment fails to give the result we predicted, our natural tendency is to reject the hypothesis we were testing, but strictly speaking, we don’t know whether our hypothesis is wrong, or whether there was something about our experimental conditions that made the experiment a bad test of the hypothesis.

In short, falsifying a hypothesis is hard, and we can never be certain that it’s false. It’s only by assessing the reasonableness of the experimental conditions that we can determine whether it’s our hypothesis or the experimental conditions that are faulty.

To my mind this is why we trust causal inferences from carefully controlled experiments more than those from observational studies. In a carefully controlled experiment, we make everything about the treatment and control as similar as possible, except for the difference in treatment. That way if we see a treatment effect, we have a lot more confidence in ascribing the result to the treatment rather than to something else, and we have a lot more confidence in saying that the treatment has no effect (and our hypothesis is falsified) if we fail to observe the expected result.

Next time we’ll talk about how to apply similar logic to observational studies and explore the challenge of making causal inferences from them.

  1. Or genetically randomized
  2. To be honest, I’ve never run such an experiment with corn, but I’ve run crude, unintentional experiments on a lot of plants I grow in my yard. I forget to fertilize some, and the difference is obvious.

Causal inference in ecology – no post this week

Last week was finals week at UConn, which means Commencement weekend began on Saturday. I represented the Provost’s Office at the PharmD ceremony at 9:00am Saturday morning, and our Master’s Commencement Ceremony was held at 1:30pm. I had yesterday off, but because of all the things that accumulated last week, I didn’t have time to write the next installment of this series. It will return next Monday, barring unforeseen complications.

In the meantime, this page contains links to the posts that have appeared so far. It also contains links to some related posts that you might find interesting. You may also have noticed the “Causal inference in ecology” label at the top of this page. That’s a link to the same page of posts in case you want to find it again.

Causal inference in ecology – Randomization and sample size

Last week I explored the logic behind controlled experiments and why they are typically regarded as the gold standard for identifying and measuring causal effects.1 Let me tie that post and the preceding one on counterfactuals together before we proceed with the next idea. To make things as concrete as possible, let’s return to our hypothetical example of determining whether applying nitrogen fertilizer increases the yield of corn. We do so by

  • Randomly assigning individual corn plants to different plots within a field.
  • Applying nitrogen fertilizer to some plots, the treatment plots, and not to others, the control plots.
  • Determining whether the yield in treatment plots exceeds that in the control plots.

Where do counterfactuals come in? If the yield of treatment plots exceeds that of control plots, aren’t we done? Well, not quite. You see, the plants that were in the treatment plots are different individuals from those that are in the control plots. To infer that nitrogen fertilizer increases yield, we have to extrapolate the results from the treatment plots to the control plots. We have to be willing to conclude that the yield in the control plots would have been greater if we had applied nitrogen fertilizer there. That’s the counterfactual. We are asserting what would have happened if the facts had been different. In practice, we don’t usually worry about this step in the logic, because we presume that our random assignment of corn plants to different plots means that the plants in the two plots are essentially equivalent. As I pointed out last time, that inference depends on having done the randomization well and having a reasonably large sample.

Let’s assume that we’ve done the randomization well, say by using a pseudorandom number generator in our computer to assign individual plants to the different plots. But let’s also assume that there is genetic variation among our corn plants that influences yield. To make things really simple, let’s assume that there’s a single locus with two alleles associated with yield differences, that high yield is dominant to low yield, and that the two alleles are in equal frequency, so that 75% of the individuals are high yield and 25% are low yield. Let’s also assume that high yield plants produce an average of 1kg of corn (sd=0.1kg) and that low yield plants produce an average of 0.5kg of corn (sd=0.1kg).2 Finally, let’s assume that applying nitrogen fertilizer has absolutely no effect on yield. Then a simple simulation in R produces the following results:3

Sample size:  5 
          lo:  133 
          hi:  140 
     neither:  9727 
Sample size:  10 
          lo:  201 
          hi:  175 
     neither:  9624 
Sample size:  20 
          lo:  255 
          hi:  217 
     neither:  9528 
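For readers who want to try something similar without R, here is a rough Python sketch of a simulation along the lines described above. It is not the R code behind the numbers shown, and it rests on a few assumptions of my own: that “sample size” means plants per treatment, that “lo” and “hi” count replicates in which the fertilized group is significantly lower or higher, and that the test is a pooled-variance two-sided t-test at the 5% level. The critical values are copied from standard t tables.

```python
import random

# Two-sided 5% critical values of Student's t with 2n - 2 degrees of
# freedom, keyed by the per-treatment sample size n (from standard tables).
T_CRIT = {5: 2.306, 10: 2.101, 20: 2.024}


def draw_yield(rng):
    """Yield (kg) of one plant: 75% high-yield, 25% low-yield genotypes."""
    mean_kg = 1.0 if rng.random() < 0.75 else 0.5
    return rng.gauss(mean_kg, 0.1)


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic for equal-sized groups."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((v - mean_a) ** 2 for v in a) / (n - 1)
    var_b = sum((v - mean_b) ** 2 for v in b) / (n - 1)
    pooled = (var_a + var_b) / 2
    return (mean_a - mean_b) / (pooled * 2 / n) ** 0.5


def simulate(n, reps=10_000, seed=None):
    """Count significant 'lo', significant 'hi', and 'neither' outcomes."""
    rng = random.Random(seed)
    counts = {"lo": 0, "hi": 0, "neither": 0}
    for _ in range(reps):
        # Fertilizer has no effect, so both groups are drawn from the
        # same population; independent draws stand in for random assignment.
        treated = [draw_yield(rng) for _ in range(n)]
        control = [draw_yield(rng) for _ in range(n)]
        t = t_statistic(treated, control)
        if t < -T_CRIT[n]:
            counts["lo"] += 1
        elif t > T_CRIT[n]:
            counts["hi"] += 1
        else:
            counts["neither"] += 1
    return counts


if __name__ == "__main__":
    for n in (5, 10, 20):
        print("Sample size:", n, simulate(n, seed=1))
```

The exact counts will differ from those above because the random number stream, the language, and possibly the details of the test are different, but the pattern to look for is the same: few rejections, split roughly evenly between the two directions.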

What you can see from these results is that I was only half right. You need to do the randomization well,4 but your sample size doesn’t need to be all that big to ensure that you get reasonable results. Keep in mind that “reasonable results” here means that (a) you reject the null hypothesis of no difference in yield about 5% of the time and (b) you reject it either way at about the same frequency.5 There are, however, other reasons that you want to have reasonable sample sizes. Refer to the posts linked to on the Causal inference in ecology page for more information about that.

With counterfactuals, controlled experiments, and randomization out of the way, our next stop will be the challenge of falsification.

  1. I didn’t discuss the “and measuring” part last week, only the “identifying” part. We’ll return to measuring causal effects later in this series after we’ve explored issues associated with identifying causal effects (or exhausted ourselves trying).
  2. That corresponds to an effect size of 0.2 standard deviations.
  3. Click through to the next page to see the R code.
  4. OK, you can’t see that you need to do the randomization well, but I did it well and it worked, so why not do it well and be safe?
  5. Since I used a two-sided t-test with a 5% significance threshold, this is just what you should expect.

Causal inference in ecology – Controlled experiments

Randomized controlled experiments are generally regarded as the gold standard for identifying a causal factor.1 Let’s describe a really simple one first. Then we’ll explore why they’re regarded as the gold standard.

Picking up with the example I used last time, let’s suppose we’re trying to test the hypothesis that applying nitrogen fertilizer increases the yield of corn.2 As I pointed out, in setting up our experiment, we’d seek to control for every variable that could influence corn yield so that we can isolate the effect of nitrogen. In the simplest possible case, we’d have two adjacent plots in a field that have been plowed and tilled thoroughly so that the soil in the two plots is completely mixed and indistinguishable in every way – same content of nitrogen, phosphorus, other macronutrients, other micronutrients; same soil texture; same percent of (the same kind of) soil organic matter; same composition of clay, silt, and sand; everything.3 We’d also have plants that were genetically uniform (or as genetically uniform as we can make them), either highly inbred lines or an F1 cross produced between two highly inbred lines. We’d make sure the field was level, maybe using high-tech laser leveling devices, and we’d make sure that every plant in the entire field received the same amount of water. Since we know that the microclimate at the perimeter of the field is different from in the middle of the field, we’d make the field big enough that we could focus our measurements on a part of the field isolated from these edge effects. Then we’d randomly choose one side of the field to be the “low N” treatment and the other to be the “high N” treatment.4 After allowing the plants to grow for an appropriate amount of time, we’d harvest them, dry them, and weigh them.

Our hypothesis has the form

If N is applied to a corn field, then the yield will be greater than if it had not been applied.

Notice that we can’t both apply N and not apply N to the same set of plants. We have to compare what happens when we apply N to one set of plants and don’t apply it to another. If we find that the “high N” plants have a greater yield than the “low N” plants, we infer that the “low N” plants would also have had a greater yield if we had applied N to them (which we didn’t). Why is that justified? Because everything about the two treatments is identical, by design, except for the amount of N applied. If there’s a difference in yield, it can only be attributed to something that differs between the treatments, and the only thing that differs is the amount of N applied.
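One common way to write this reasoning down uses potential-outcomes notation. The symbols are not from the discussion above; they are introduced here only as an assumption for illustration. Let \(T_i\) be 1 if plant \(i\) gets “high N” and 0 if it gets “low N”, and let \(Y_i(1)\) and \(Y_i(0)\) be the yields that plant would have under each treatment. We only ever observe one of the two:

\(Y_i = T_i \, Y_i(1) + (1 - T_i) \, Y_i(0)\)

The effect we care about for plant \(i\) is the difference we can never observe directly:

\(\tau_i = Y_i(1) - Y_i(0)\)

If the two sets of plants are equivalent in everything except the treatment, the difference between the group averages stands in for the average of those unobservable differences:

\(E[Y_i \mid T_i = 1] - E[Y_i \mid T_i = 0] = E[Y_i(1) - Y_i(0)]\)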

I can hear you thinking, “Couldn’t the difference just be due to chance?” Well, yes it could. If we do a statistical test and demonstrate that the yields are statistically distinguishable, that increases our confidence that the difference in yield is real, but nothing can ever make the conclusion logically certain in the way that 2+2=4 is logically certain.5 To my mind there are two things that make us accept the outcome of this experiment as evidence that applying N increases corn yield:

  1. It’s not just this experiment. If the same experiment is repeated in different places with different soil types, different corn genotypes, and different weather patterns, we get the same result. We can never be certain, but the consistency of that result increases our confidence that the association isn’t just a fluke.
  2. What we understand about plant growth and physiology leads us to expect that providing nitrogen in fertilizer should enhance plant growth. In other words, this particular hypothesis is part of a larger theoretical framework about plant physiology and development. That framework provides a coherent and repeatable set of predictions across a wide empirical domain.

Put those two together, and we have good reason for thinking that the observed association between N fertilizer and corn yield is actually a causal association.

In experiments where we can’t completely control all relevant variables except the one that we’re interested in, we rely on randomization. Suppose, for example, we couldn’t produce genetically uniform corn. Then we’d randomize the assignment of individuals to the “high” and “low” treatments. The results aren’t quite as solid as if we’d had complete uniformity. It’s always possible that by some statistical fluke a factor we aren’t measuring ends up overrepresented in one treatment and underrepresented in the other, but if we’ve randomized well and we have a reasonably large sample, the chances are small. So our inference isn’t quite as firm, but it’s still pretty good.
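Here is a small Python sketch of what that randomization, and the “statistical fluke” worry, might look like in practice. Everything in it is hypothetical: the function names, the unmeasured trait carried by half of the plants, and the 0.2 threshold used to call a split “imbalanced”.

```python
import random


def randomize(plant_ids, seed=None):
    """Randomly split plants into equal-sized "high" and "low" N groups."""
    rng = random.Random(seed)
    shuffled = list(plant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"high": shuffled[:half], "low": shuffled[half:]}


def imbalance_rate(n_plants, threshold=0.2, reps=2000, seed=None):
    """Share of randomizations in which the frequency of an unmeasured
    trait differs between the two groups by more than `threshold`.

    Assumes n_plants is even and each plant carries the trait with
    probability 0.5 (an arbitrary choice for illustration).
    """
    rng = random.Random(seed)
    half = n_plants // 2
    extreme = 0
    for _ in range(reps):
        hidden = [rng.random() < 0.5 for _ in range(n_plants)]
        rng.shuffle(hidden)  # random assignment to the two treatments
        carriers_high = sum(hidden[:half])
        carriers_low = sum(hidden[half:])
        # Compare counts rather than fractions to avoid rounding trouble.
        if abs(carriers_high - carriers_low) > threshold * half:
            extreme += 1
    return extreme / reps


if __name__ == "__main__":
    print(randomize(range(10), seed=3))
    for n in (10, 200):
        print(n, imbalance_rate(n, seed=1))
```

With only a handful of plants per treatment, noticeably lopsided splits of the hidden trait are fairly common; with a hundred per treatment they become rare, which is the intuition behind wanting a reasonably large sample.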

We’ll explore the “reasonably large sample” question in the next installment.

  1. See, for example, Rubin (Annals of Applied Statistics 2:808-840; 2008. https://projecteuclid.org/euclid.aoas/1223908042)
  2. If you know me or my work, you know that I’m not at all crazy about the null hypothesis testing approach to investigating ecology. We’ll get to that later, but let’s start with a simple case. Even those of us who don’t like null hypothesis testing as a general approach recognize that it has value. We’ll focus on one way in which it has value here.
  3. If we were really fastidious we might even set up the experiment in a large growth chamber in which we mixed the soil together and distributed it evenly ourselves.
  4. If we were really paranoid about controlling for all possible factors, we’d even randomly assign a nitrogen fertilizer level (high or low) to every different plant in the field, and we’d probably do the whole experiment in a very large growth chamber where we could mix the soil ourselves and ensure that light, humidity, and temperature were as uniform as possible across all individuals in the experiment.
  5. If you don’t see why, Google “problem of induction” and you’ll get some idea. If that doesn’t satisfy you, ask, and I’ll see what I can do to provide an explanation.

Causal inference in ecology – Counterfactuals

Let’s start with a few preliminaries.1

  • A causal factor (“cause” for short) is something that is predictably related to a particular outcome. For example, fertilizing crops generally increases their yield, so fertilizer is a causal factor related to yield. The way I think about it, a causal factor need not always lead to the outcome. It’s enough if it merely increases the probability of the outcome. For example, smoking doesn’t always lead to lung cancer among those who smoke, but it does increase the probability that you will suffer from lung cancer if you smoke.
  • Causes precede effects.2 That’s one reason why teleology is problematic. A teleological explanation explains the current state of things as a result of, i.e., as caused by, something in the future, namely a purpose.3
  • Effects may have multiple causes. The world, or at least the world of biology, is a complicated place. Regardless of what phenomenon you’re studying, there are likely to be several (or many) causal factors that influence it.

The last point is one of the most important ones for purposes of this series. When we are investigating a phenomenon,4 we’re trying to discern which of several plausible causal factors plays a role and, possibly, the relative “importance” of those causal factors.5

To make this concrete, let’s suppose that we’re trying to determine whether application of nitrogen fertilizer increases the yield of corn. That means we have to determine whether adding nitrogen and adding nitrogen alone increases corn yield. Why the emphasis on “adding nitrogen alone”? Suppose that we added nitrogen to a corn field by adding manure. Then increases in the amount of applied nitrogen are associated with increases in the amount of a host of other substances. If yields increased, we’d know that adding manure increases yield, but not whether it’s because of the nitrogen in manure or something else. Why does this matter?

From very early on in our education we’re taught that “correlation is not the same as causation.” We want to distinguish cases where A causes B from cases where A is merely correlated with B. Yet, as David Hume pointed out long ago, experience6 alone can only show us that A and B actually occur together, not that they must occur together (link). One way of distinguishing cause from correlation is that causes support counterfactual statements. They provide us with a reason to believe statements like “If we had applied nitrogen to the field, the corn yield would have increased” even if we never applied nitrogen to the field at all. The only reason I can see that we could believe such a statement is if we had already determined that adding nitrogen and adding nitrogen alone increases corn yield.7
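One way to picture a counterfactual statement is to imagine that every field has two potential yields, one with nitrogen and one without, of which we only ever get to see one. That framing is borrowed from the Rubin causal model; the `Field` class and all of the numbers below are hypothetical, a sketch rather than anything measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One corn field with both potential yields (hypothetical numbers)."""
    name: str
    yield_without_n: float   # yield if nitrogen is not applied
    yield_with_n: float      # yield if nitrogen is applied
    got_nitrogen: bool       # what actually happened

    def observed_yield(self) -> float:
        """The only yield we can ever measure for this field."""
        return self.yield_with_n if self.got_nitrogen else self.yield_without_n

    def counterfactual_yield(self) -> float:
        """The yield under the treatment that did not happen."""
        return self.yield_without_n if self.got_nitrogen else self.yield_with_n


fields = [
    Field("A", 9.0, 11.5, got_nitrogen=False),
    Field("B", 10.0, 12.0, got_nitrogen=True),
    Field("C", 8.5, 10.0, got_nitrogen=False),
]

for f in fields:
    effect = f.yield_with_n - f.yield_without_n
    print(f"{f.name}: observed={f.observed_yield():.1f} "
          f"counterfactual={f.counterfactual_yield():.1f} effect={effect:+.1f}")
```

In the simulation we can print both columns because we invented them. In a real field A we would see only the yield without nitrogen, so believing that its yield would have been higher with nitrogen still requires knowing that adding nitrogen alone increases corn yield.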

How do we determine that? Randomized controlled experiments are the most widely known approach, and they are typically regarded as the gold standard against which all other means of inference are compared. That’s where we’ll pick up in the next installment.

  1. As I warned in the introduction to the series, I am not an expert in causal inference. The terminology I use is likely both to be imprecise and to be somewhat different from the terminology experts use.
  2. Philosophers have argued about whether backward causation is possible, but I’m going to ignore that possibility.
  3. Biologists sometimes use teleological language to explain adaptation, e.g., land animals evolved legs to provide mobility. It is, however, relatively easy (if a bit long-winded) to eliminate the teleological language, because natural selection shows how adaptations arise from differential reproduction and survival (link).
  4. Or at least this is how it is when I’m investigating a phenomenon.
  5. I’ll come back to the idea of identifying the relative importance of causal factors in a future post.
  6. Or experiment.
  7. If there are any philosophers reading this, you’ll recognize that this account is horribly sketchy and amounts to little more than proof by vigorous assertion. If you’re so inclined, I invite you to flesh out more complete explanations for readers who are interested.

Causal inference in ecology – Introduction to the series

If you’ve been following posts here since the first of the year, you know that I’ve been writing about how I keep myself organized. Today I’m starting a completely different series in which I begin to collect my thoughts on how we can make judgments about the cause (or causes) of ecological phenomena1 and the circumstances under which judgments are possible. Before I start, I need to offer a few disclaimers.

  • Any evolutionary biologist or ecologist who knows me and my work knows that it’s not uncommon for my ideas to represent a minority opinion. (Think pollen discounting for those of you who know my work on the evolution of plant mating systems.) I make no claim that anything I write here is broadly representative of what my fellow evolutionary biologists and ecologists think, only that it’s what I think. Please challenge me on anything you think I’ve got wrong, because I’m sure there will be things I get wrong, and the easiest way for me to discover those errors is for someone else to point them out.
  • I had a minor in Philosophy as an undergraduate and there is an enormous literature on causality in the philosophy of science. I’ll be using a very crude understanding of “cause.” I don’t think it is wildly misleading, but I’m certain it wouldn’t stand up to serious scrutiny.2
  • I’ll be thinking about causal inference in the specific context of trying to infer causes from observational data using statistics rather than inferring causes from controlled experiments.3 I’ll be using an approach developed in the 1970s by Donald Rubin, the Rubin Causal Model.4
  • There is a very large literature on causal inference in the social sciences. I’ll be drawing heavily on Imbens and Rubin, Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction,5 but there’s an enormous amount of material there that I won’t attempt to cover. I am also pretty new to the concepts associated with the Rubin causal model, so it’s entirely possible that I’ll misrepresent or misinterpret a point that the real experts got right. In other words, if something I say doesn’t make any sense, it’s more likely I got it wrong than that Imbens and Rubin got it wrong.

Although I will be thinking about causal inference in the context of observational data and statistics, I don’t plan to write much (if at all) about the problems with P-values, Bayes factors, credible/confidence intervals overlapping 0 (or not), and the like. If you’d like to know the concerns I have about them, here are links to old posts on those issues.

  1. I’m calling the post “Causal inference in ecology” only because “Causal inference in ecology, evolutionary biology, and population genetics” would be too long.
  2. There’s a good chance that a moderately competent undergraduate Philosophy major would find it woefully inadequate.
  3. To be more precise, we don’t infer causes from controlled experiments. Rather, we have pre-existing hypotheses about possible causes, and we use controlled experiments to test those hypotheses.
  4. In my relatively limited reading on the subject, I’ve most often seen it referred to as the Rubin causal model, but it is sometimes referred to as the Neyman causal model.
  5. Reminder: If you click on that link, it will take you to Amazon.com. I use that link simply because it’s convenient. You can buy the book, if you’re so inclined, from many other outlets. I am not an Amazon affiliate, and I will not receive any compensation if you decide to buy the book, regardless of whether you buy it at Amazon or elsewhere. By the way, Chapter 23 in Gelman and Hill’s book, Data Analysis Using Regression and Multilevel/Hierarchical Models, has an excellent overview of the Rubin causal model.