Uncommon Ground

Academics, biodiversity, genetics, & evolution

Latest Posts

Causal inference in ecology – Setting the stage for the Rubin causal model

Causal inference in ecology – links to the series

If you’ve been following along, you realize by now that it’s not easy to infer the cause of a phenomenon, even in a well-controlled experiment. What about observational experiments, which are what many ecologists and evolutionary biologists have? Take one paper of mine that I’m reasonably happy with (PLoS One e52035; 2012. doi: 10.1371/journal.pone.0052035). One part of that paper included an experiment of sorts. We1 established experimental gardens at the Kirstenbosch National Botanical Garden and at a mid-elevation site on Jonaskop mountain about 100km due east of Kirstenbosch. Among other things we were interested both in how certain traits in newly formed leaves (specific leaf area, stomatal pore index, and leaf area) differed depending on the age of the plant and on the garden in which they were grown.

This part of the paper is analogous to the thought experiment on corn that we’ve discussed so far. Since the plants were grown from wild-collected seedlings, we obviously couldn’t duplicate genotypes across gardens. We also didn’t have enough seed from individual maternal plants to replicate families across the gardens. So we did the best we could. We randomized seedlings within populations and split populations across gardens. You’ll see the results from this part of the analysis in the figure below.

Figure 4 from PLoS One e52035; 2012. doi: 10.1371/journal.pone.0052035

The trends with plant age are clear.2 Specific leaf area (SLA) declines with plant age, stomatal pore index increases with plant age, and leaf area increases with plant age. Given that the trends are consistent across species and gardens, I’m reasonably confident that plant age influences these traits in this group of Protea.3 Notice that I wrote “influences”, which is a short way (for me) of writing that plant age is a causal factor that influences the traits but that I am not claiming that it is the causal factor.4

Figure 3 from PLoS One e52035; 2012. doi: 10.1371/journal.pone.0052035

Similarly, the figure above makes it clear that which garden the plants are grown in influences these (and other) traits. These results won’t surprise anyone whose worked with plants. The traits plants have depend both on how old the plant is and on where its grown. So far a reasonably straightforward experiment, but how about this? We also wanted to know whether the amount of change in leaf traits depended on measures of resource availability and rainfall seasonality in the places that seeds were collected from. Here we’re asking a more complicated question.

Take SLA, for example, and imagine that we’re asking the question just about changes in SLA for plants grown at Jonaskop. Now the fully fleshed out the question is something like this:

  • I know that plants are often adapted to the local circumstances in which they are growing.
  • If SLA reflects plant characteristics that are important in local adaptation to nutrients or water availability, then plants that grow in places that differ in nutrients or water availability should also differ in SLA in ways that make them well-suited to the place where they occur.
  • Do we have evidence that there is an association between changes in SLA and nutrient availability or precipitation patterns in the site from which they are derived?

It’s that middle step that’s tricky. We don’t need to do anything special to run a regression on changes in SLA and home site characteristics, but to interpret that regression as evidence for the causal story in that middle step we need to do something more. Unlike the nicely randomized experiment with which we began this post, we aren’t randomizing plants across sites and allowing them to adapt. What we have is purely observational data to address this question. To what extent can we make a causal inference from these data? That’s the question I’ll turn to in the next installment.

  1. By “we” I should be clear that Jane Carlson, Rachel Prunier, and Ann Marie Gawel collected all of the seeds that “we” used to establish the gardens, and they did all of the work of germinating seedlings and establishing the gardens. They also collected nearly all of the data. I helped collect a little, but my help mostly consisted of standing there with a clipboard and data sheet and writing down numbers in the appropriate columns.
  2. The same plants were measured in 2009 and 2010. In both cases, measurements were made on newly formed, but fully expanded leaves. I’m not reporting P-values, but you can find them in Table 1.
  3. The species are all members of a small, recently evolved monophyletic clade, Protea sect. Exsertae.
  4. Notice also that I am discounting the possibility that it is weather in the year the plants were growing that influences their traits rather than their age.

Honoring Ruth Millikan

Ruth Millikan is Emeritus Board of Trustees Distinguished Professor of Philosophy at UConn. Quoting from her web page, Ruth’s “research interests span many topics in the philosophy of biology, philosophy of mind, philosophy of language, and ontology.” She is a highly respected and influential philosopher. From her Wikipedia page:

She was awarded the Jean Nicod Prize and gave the Jean Nicod Lectures in Paris in 2002.[3] She was elected to the American Academy of Arts and Sciences in 2014 [4] and received, in 2017, both the Nicholas Rescher Prize for Systematic Philosophy from the University of Pittsburgh[5] and the Rolf Schock Prize in Logic and Philosophy.[6]

On April 30th I had the great honor of presenting a few remarks at an event held to celebrate Ruth’s contributions and to inaugurate the Ruth Garrett Millikan Endowment to support graduate students. Daniel Dennett was the featured speaker, and he highlighted Ruth’s contributions, focusing especially on one of her early books – Language, Thought, and Other Biological Categories – and her most recent one – Beyond Concepts. If you want to understand why her work is so important, you’ll need to read those books yourself. Her Wikipedia page provides only a very brief summary.

My comments focused on why graduate education, particularly PhD education, and financial support for graduate education is vital. On the off chance you’re interested in reading what I had to say, the full text of my remarks follows.


Causal inference in ecology – The challenge of falsification

Causal inference in ecology – links to the series

It sounds so simple. You have a hypothesis. You design an experiment to test it. If the predicted result doesn’t happen, reject the hypothesis and start over. That’s how science works, right? We can’t prove a hypothesis, but we can reject them. That’s how we make progress. That’s what makes science empirical. End of story right? Would I be asking that question if it were?

Let’s look at the logic a bit more carefully.

The hypothesis we’ve been using as an example is simple: If we apply nitrogen fertilizer, the yield of corn will increase. Our experiment is to till the soil in a field thoroughly, plant genetically uniform1 corn, and apply fertilizer on one part of the field and not the other. The test of our hypothesis is whether yield in the fertilized part of the field exceeds yield in the unfertilized part of the field. For the sake of argument, let’s suppose that the fertilized part of the field has the same yield (or less) than the unfertilized part of the field. Would you conclude that adding nitrogen fertilizer doesn’t increase corn yield? I wouldn’t, and I’ll bet you wouldn’t either. Why wouldn’t we conclude that? My logic would run like this:

  • I’m aware of a lot of other experiments, including some I’ve run myself, where adding nitrogen fertilizer to corn (and to other plants for that matter) increases yield.2 There must have been something wrong with the experimental conditions.
  • The experimental conditions include everything about the experiment.
    • It could be that I didn’t do a good job of tilling the field and mixing the soil. Maybe the part of the field that I left unfertilized happened to have much higher soil fertility, more than enough to compensate for the added nitrogen in the part of the field with lower fertility. Maybe the part of the field I fertilized happened to have minerals in the soil that immediately bound the nitrogen so that it wasn’t available to the plants.
    • It could be that there was something wrong with the fertilizer. Maybe it was a bad batch and for some reason the nitrogen wasn’t in a form that’s available for plants.
    • Maybe I didn’t do a good job of randomizing the genetic background, and I happened to have families of low-yield plants in the nitrogen fertilizer treatment.
    • Maybe I put on so much nitrogen that I “burned” the corn.

The bottom line is, there are a lot of ways that the experiment could have gone wrong. When an experiment fails to give the prediction we expected, our natural tendency is to reject the hypothesis we were testing, but strictly speaking, we don’t know whether our hypothesis is wrong, or whether there was something about our experimental conditions that made the experiment a bad test of the hypothesis.

In short, falsifying a hypothesis is hard, and we can never be certain that it’s false. It’s only by assessing the reasonableness of the experimental conditions that we can determine whether it’s our hypothesis or the experimental conditions that are faulty.

To my mind this is why we trust causal inferences from carefully controlled experiments more than those from observational studies. In a carefully controlled experiment, we make everything about the treatment and control as similar as possible, except for the difference in treatment. That way if we see a treatment effect, we have a lot more confidence in ascribing the result to the treatment not something else, and we have a lot more confidence in saying that the treatment has no effect (and our hypothesis is satisfied) if we fail to observe the expected result.

Next time we’ll talk about how to apply similar logic to observational studies and explore the challenge of making causal inferences from them.

  1. Or genetically randomized
  2. To be honest. I’ve never run such an experiment with corn, but I’ve run crude, unintentional experiments on a lot of plants I grow in my yard. I forget to fertilize some, and the difference is obvious.

Causal inference in ecology – no post this week

Last week was finals week at UConn, which means Commencement weekend began on Saturday. I represented the Provost’s Office at the PharmD ceremony at 9:00am Saturday morning, and our Master’s Commencement Ceremony was held at 1:30pm. I had yesterday off, but because of all the things that accumulated last week, I didn’t have time to write the next installment of this series. It will return next Monday, barring unforseen complications.

In the meantime, this page contains links to the posts that have appeared so far. It also contains links to some related posts that you might find interesting. You may also have noticed the “Causal inference in ecology” label at the top of this page. That’s a link to the same page of posts in case you want to find it again.

Causal inference in ecology – Randomization and sample size

Causal inference in ecology – links to the series

Last week I explored the logic behind controlled experiments and why they are typically regarded as the gold standard for identifying and measuring causal effects.1 Let me tie that post and the preceding one on counterfactuals together before we proceed with the next idea. To make things as concrete as possible, let’s return to our hypothetical example of determining whether applying nitrogen fertilizer increases the yield of corn. We do so by

  • Randomly assigning individual corn plants to different plots within a field.
  • Applying nitrogen fertilizer to some plots, the treatment plots, and not to others, the control plots.
  • Determining whether the yield in treatment plots exceeds that in the control plots.

Where do counterfactuals come in? If the yield of treatment plots exceeds that of control plots aren’t we done? Well, not quite. You see the plants that were in the treatment plots are different individuals from those that are in the control plots. To infer that nitrogen fertilizer increases yield, we have to extrapolate the results from the treatment plots to the control plots. We have to be willing to conclude that the yield in the control plots would have been greater if we had applied nitrogen fertilizer there. That’s the counterfactual. We are asserting what would have happened if the facts had been different. In practice, we don’t usually worry about this step in the logic, because we presume that our random assignment of corn plants to different plots means that the plants in the two plots are essentially equivalent. As I pointed out last time, that inference depends on having done the randomization well and having a reasonably large sample.

Let’s assume that we’ve done the randomization well, say by using a pseudorandom number generator in our computer to assign individual plants to the different plots. But let’s also assume that there is genetic variation among our corn plants that influences yield. To make things really simple, let’s assume that there’s a single locus with two alleles associated with yield differences, that high yield is dominant to low yield, and that the two alleles are in equal frequency, so that 75% of the individuals are high yield and 25% are low yield. To make things really simple let’s further assume that all of the high yield plants produce 1kg of corn (sd=0.1kg) and that all of the low yield plants produce exactly 0.5kg of corn (sd=0.1kg).2 Let’s further assume that applying nitrogen fertilizer has absolutely no effect on yield.Then a simple simulation in R produces the following results:3

Sample size:  5 
          lo:  133 
          hi:  140 
     neither:  9727 
Sample size:  10 
          lo:  201 
          hi:  175 
     neither:  9624 
Sample size:  20 
          lo:  255 
          hi:  217 
     neither:  9528 

What you can see from these results is that I was only half right. You need to do the randomization well,4 but your sample size doesn’t need to be all that big to ensure that you get reasonable results. Keep in mind that “reasonable results” here means that (a) you reject the null hypothesis of no difference in yield about 5% of the time and (b) you reject it either way at about the same frequency.5 There are, however, other reasons that you want to have reasonable sample sizes. Refer to the posts linked to on the Causal inference in ecology page for more information about that.

With counterfactuals, controlled experiments, and randomization out of the way, our next stop will be the challenge of falsification.I didn’t discuss the “and measuring” part last week, only the “identifying” part. We’ll return to measuring causal effects later in this series after we’ve explored issues associated with identifying causal effects (or exhausted ourselves trying).

  1. That corresponds to an effect size of 0.2 standard deviations.
  2. Click through to the next page to see the R code.
  3. OK, you can’t see that you need to do the randomization well, but I did it well and it worked, so why not do it well and be safe?
  4. Since I used a two-sided t-test with a 5% significance threshold, this is just what you should expect.


Causal inference in ecology – Controlled experiments

Causal inference in ecology – links to the series

Randomized controlled experiments are generally regarded as the gold standard for identifying a causal factor.1 Let’s describe a really simple one first. Then we’ll explore why they’re regarded as the gold standard.

Picking up with the example I used last time, let’s suppose we’re trying to test the hypothesis that applying nitrogen fertilizer increases the yield of corn.2 As I pointed out, in setting up our experiment, we’d seek to control for every variable that could influence corn yield so that we can isolate the effect of nitrogen. In the simplest possible case, we’d have two adjacent plots in a field that have been plowed and tilled thoroughly so that the soil in the two plots is completely mixed and indistinguishable in every way – same content of nitrogen, phosphorous, other macronutrients, other micronutrients; same soil texture; same percent of (the same kind of) soil organic matter; same composition of clay, silt, and sand; everything.3 We’d also have plants that were genetically uniform (or as genetically uniform as we can make them), either highly inbred lines or an F1 cross produced between two highly inbred lines. We’d make sure the field was level, maybe using high-tech laser leveling devices, and we’d make sure that every plant in the entire field received the same amount of water. Since we know that the microclimate at the perimeter of the field is different from in the middle of the field, we’d make the field big enough that we could focus our measurements on a part of the field isolated from these edge effects. Then we’d randomly choose one side of the field to be the “low N” treatment and the other to be the “high N” treatment.4 After allowing the plants to grow for an appropriate amount of time, we’d harvest them, dry them, and weigh them.

Our hypothesis has the form

If N is applied to a corn field, then the yield will be greater than if it had not been applied.

Notice that we can’t both apply N and not apply N to the same set of plants. We have to compare what happens when we apply N to one set of plants and don’t apply it to another. If we find that the “high N” plants have a greater yield than the “low N” plants, we infer that the “low N” plants would also have had a greater yield if we had applied N to them (which we didn’t). Why is that justified? Because everything about the two treatments is identical, by design, except for the amount of N applied. If there’s a difference in yield, it can only be attributed to something that differs between the treatments, and the only thing that differs is the amount of N applied.

I can hear you thinking, “Couldn’t the difference just be due to chance?” Well, yes it could. If we do a statistical test and demonstrate that the yields are statistically distinguishable, that increases our confidence that the difference in yield is real, but nothing can ever make the conclusion logically certain in the way we can be logically certain that 2+2=4.5 To my mind there are two things that make us accept the outcome of this experiment as evidence that applying N increases corn yield:

  1. It’s not just this experiment. If the same experiment is repeated in different places with different soil types, different corn genotypes, and different weather patterns, we get the same result. We can never be certain, but the consistency of that result increases our confidence that the association isn’t just a fluke.
  2. What we understand about plant growth and physiology leads us to expect that providing nitrogen in fertilizer should enhance plant growth. In other words, this particular hypothesis is part of a larger theoretical framework about plant physiology and development. That framework provides a coherent and repeatable set of predictions across a wide empirical domain.

Put those two together, and we have good reason for thinking that the observed association between N fertilizer and corn yield is actually a causal association.

In experiments where we can’t completely control all relevant variables except the one that we’re interested in, we rely on randomization. Suppose, for example, we couldn’t produce genetically uniform corn. Then we’d randomize the assignment of individuals to the “high” and “low” treatments. The results aren’t quite as solid as if we’d had complete uniformity. It’s always possible that by some statistical fluke a factor we aren’t measuring ends up overrepresented in one treatment and underrepresented in the other, but if we’ve randomized well and we have a reasonably large sample, the chances are small. So our inference isn’t quite as firm, but it’s still pretty goo.

We’ll explore the “reasonably large sample question” in the next installment.

  1. See, for example, Rubin (Annals of Applied Statistics 2:808-840; 2008. https://projecteuclid.org/euclid.aoas/1223908042)
  2. If you know me or my work, you know that I’m not at all crazy about the null hypothesis testing approach to investigating ecology. We’ll get to that later, but let’s start with a simple case. Even those of us who don’t like null hypothesis testing as a general approach recognize that it has value. We’ll focus on one way in which it has value here.
  3. If we were really fastidious we might even set up the experiment in a large growth chamber in which we mixed the soil together and distributed it evenly ourselves.
  4. If we were really paranoid about controlling for all possible factors, we’d even randomly assign a nitrogen fertilizer level (high or low) to every different plant in the field, and we’d probably do the whole experiment in a very large growth chamber where we could mix the soil ourselves and ensure that light, humidity, and temperature were as uniform as possible across all individuals in the experiment.
  5. If you don’t see why, Google “problem of induction” and you’ll get some idea. If that doesn’t satisfy you, ask, and I’ll see what I can do to provide an explanation.

Made it to Brisbane

It’s among the longest stretch of (planned) travel that I’ve done.1 I

  • left Hartford at 11:45am EDT on Thursday, April 19
  • arrived in Cinncinnati at 1:53pm,
  • left Cinncinnati at 2:45pm,
  • arrived in Los Angeles at 4:45pm PDT,
  • left Los Angeles at 10:30, and
  • arrived in Brisbane at 5:30am Australian Eastern time on Saturday, April 21.

There’s a 14 hour time difference between Brisbane and Hartford. That makes the total travel time 27 hours, 45 minutes gate to gate. I arrived at my hotel about 2 1/2 hours ago. Remarkably, they had a room they could give me, even though the official check in time isn’t until 2:00pm. It’s a very comfortable room in what appears to be a very nice part of the city. I don’t have any meetings until tomorrow. Once I’ve finished up a couple of things I want to do, I’m going to put on some comfortable shoes and go for a walk around town with my camera. First stop, the City Botanic Gardens. I’ll miss the farmer’s market, which is held tomorrow, and I’m not sure what I’ll visit after the botanic gardens, but I’m going to keep going all day. If I can go to bed at something resembling a normal time, there’s a good chance I’ll escape the worst effects of jet lag tomorrow.

The photo is the view from my hotel room.

  1. A few years ago it took me 3 1/2 days to get home from Capetown. I was stranded in Amsterdam for 2 nights. Yes I mean stranded. The first night I was stuck in the airport. The second night I was at an airport hotel, but I didn’t get there until 3 in the afternoon – too late to go into the city and enjoy anything.

On my way to Brisbane

UConn is a member of Universitas 21, an international group of universities dedicated to excellence in research and education. Every year Deans and Directors of Graduate Studies (DDoGS) of U21 universities gather at one of the member universities to exchange ideas about improving graduate education. This year the discussions will include sessions on providing support for career planning, advising and mentoring graduate students and postdocs, and entrepreneurship. The meetings begin Sunday morning and continue through Tuesday – three very full days of vigorous discussion.

This year, the University of Queensland is hosting the meeting, hence my travel to Brisbane. I’m sitting in the departure lounge at Bradley as I type this, and I expect to land in Brisbane about 36 hours from now. If time permits, I’ll send an update or two from the DDoGS meeting. If not, expect a short report after I return.

Causal inference in ecology – Counterfactuals

Causal inference in ecology – links to the series

Let’s start with a few preliminaries.1

  • A causal factor (“cause” for short) is something that is predictably related to a particular outcome. For example, fertilizing crops generally increases their yield, so fertilizer is a causal factor related to yield. The way I think about it, a causal factor need not always lead to the outcome. It’s enough if it merely increases the probability of the outcome. For example, smoking doesn’t always lead to lung cancer among those who smoke, but it does increase the probability that you will suffer from lung cancer if you smoke.
  • Causes precede effects.2 That’s one reason why teleology is problematic. A teleological explanation explains the current state of things as a result of, i.e., as caused by, something in the future, namely a purpose.3
  • Effects may have multiple causes. The world, or at least the world of biology, is a complicated place. Regardless of what phenomenon you’re studying, there are likely to be several (or many) causal factors that influence.

The last point is one of the most important ones for purposes of this series. When we are investigating a phenomenon,4 we’re trying to discern which of several plausible causal factors plays a role and, possibly, the relative “importance” of those causal factors.5

To make this concrete, let’s suppose that we’re trying to determine whether application of nitrogen fertilizer increases the yield of corn. That means we have to determine whether adding nitrogen and adding nitrogen alone increases corn yield. Why the emphasis on “adding nitrogen alone”? Suppose that we added nitrogen to a corn field by adding manure. Then increases in the amount of applied nitrogen are associated with increases in the amount of a host of other substances. If yields increased, we’d know that adding manure increases yield, but not whether it’s because of the nitrogen in manure or something else. Why does this matter?

From very early on in our education we’re taught that “correlation is not the same as causation.” We want to distinguish cases where A causes B from cases where A is merely correlated with B. Yet, as David Hume pointed out long ago, experience6 alone can only show us that A and B actually occur together, not that they must occur together (link). One way of distinguishing cause from correlation is that causes support counterfactual statements. They provide us with a reason to believe statements like “If we had applied nitrogen to the field, the corn yield would have increased” even if we never applied nitrogen to the field at all. The only reason I can see that we could believe such a statement is if we had already determined that adding nitrogen and adding nitrogen alone increases corn yield.7

How do we determine that? Randomized controlled experiments are the most widely known approach, and they are typically regarded as the gold standard against which all other means of inference are compared. That’s where we’ll pick up in the next installment.

  1. As I warned in the introduction to the series, I am not an expert in causal inference. The terminology I use is likely both to be imprecise and to be somewhat different from the terminology experts use.
  2. Philosophers have argued about whether backward causation is possible, but I’m going to ignore that possibility.
  3. Biologists sometimes use teleological language to explain adaptation, e.g., land animals evolved legs to provide mobility. It is, however, relatively easy (if a bit long-winded) to eliminate the teleological language, because natural selection shows how adaptations arise from differential reproduction and survival (link).
  4. Or at least this is how it is when I’m investigating a phenomenon.
  5. I’ll come back to the idea of identifying the relative importance of causal factors in a future post.
  6. Or experiment.
  7. If there are any philosophers reading this, you’ll recognize that this account is horribly sketchy and amounts to little more than proof by vigorous assertion. If you’re so inclined, I invite you to flesh out more complete explanations for readers who are interested.