It sounds so simple. You have a hypothesis. You design an experiment to test it. If the predicted result doesn’t happen, reject the hypothesis and start over. That’s how science works, right? We can’t prove a hypothesis, but we can reject one. That’s how we make progress. That’s what makes science empirical. End of story, right? Would I be asking that question if it were?
Let’s look at the logic a bit more carefully.
The hypothesis we’ve been using as an example is simple: If we apply nitrogen fertilizer, the yield of corn will increase. Our experiment is to till the soil in a field thoroughly, plant genetically uniform1 corn, and apply fertilizer on one part of the field and not the other. The test of our hypothesis is whether yield in the fertilized part of the field exceeds yield in the unfertilized part of the field. For the sake of argument, let’s suppose that the fertilized part of the field has the same yield as (or less than) the unfertilized part of the field. Would you conclude that adding nitrogen fertilizer doesn’t increase corn yield? I wouldn’t, and I’ll bet you wouldn’t either. Why not? My logic would run like this:
- I’m aware of a lot of other experiments, including some I’ve run myself, where adding nitrogen fertilizer to corn (and to other plants for that matter) increases yield.2 There must have been something wrong with the experimental conditions.
- The experimental conditions include everything about the experiment.
- It could be that I didn’t do a good job of tilling the field and mixing the soil. Maybe the part of the field that I left unfertilized happened to have much higher soil fertility, more than enough to compensate for the added nitrogen in the part of the field with lower fertility. Maybe the part of the field I fertilized happened to have minerals in the soil that immediately bound the nitrogen so that it wasn’t available to the plants.
- It could be that there was something wrong with the fertilizer. Maybe it was a bad batch and for some reason the nitrogen wasn’t in a form that’s available for plants.
- Maybe I didn’t do a good job of randomizing the genetic background, and I happened to have families of low-yield plants in the nitrogen fertilizer treatment.
- Maybe I put on so much nitrogen that I “burned” the corn.
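The first of those failure modes — an unnoticed fertility difference between the two halves of the field — is easy to see in a toy simulation. Everything here is made up for illustration (the yield numbers, the size of the nitrogen effect, the fertility gap), but it shows how a hypothesis that is *true* can still fail its experimental test:

```python
import random

random.seed(1)

# Hypothetical numbers, for illustration only.
TRUE_N_EFFECT = 20.0   # fertilizer really does raise yield (bushels/acre)
BASE_YIELD = 150.0

def plot_yield(fertilized, soil_fertility):
    """Yield of one plot: baseline + soil effect + (real) fertilizer effect + noise."""
    y = BASE_YIELD + soil_fertility + (TRUE_N_EFFECT if fertilized else 0.0)
    return y + random.gauss(0, 5)

# Suppose the half we left unfertilized happens to sit on richer soil
# (a +30 fertility advantage that our tilling failed to even out):
fertilized   = [plot_yield(True,  soil_fertility=0.0)  for _ in range(50)]
unfertilized = [plot_yield(False, soil_fertility=30.0) for _ in range(50)]

mean = lambda xs: sum(xs) / len(xs)
print(f"fertilized half:   {mean(fertilized):.1f}")
print(f"unfertilized half: {mean(unfertilized):.1f}")
# The unfertilized half out-yields the fertilized half even though the
# nitrogen effect is real -- the experiment failed, not the hypothesis.
```

If we took this result at face value, we’d “reject” a true hypothesis because of a flaw in the experimental conditions.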
The bottom line is that there are a lot of ways the experiment could have gone wrong. When an experiment fails to give the prediction we expected, our natural tendency is to reject the hypothesis we were testing, but strictly speaking, we don’t know whether our hypothesis is wrong or whether something about our experimental conditions made the experiment a bad test of the hypothesis.
In short, falsifying a hypothesis is hard, and we can never be certain that it’s false. It’s only by assessing the reasonableness of the experimental conditions that we can determine whether it’s our hypothesis or the experimental conditions that are faulty.
To my mind this is why we trust causal inferences from carefully controlled experiments more than those from observational studies. In a carefully controlled experiment, we make everything about the treatment and control as similar as possible, except for the difference in treatment. That way, if we see a treatment effect, we have a lot more confidence in ascribing the result to the treatment rather than to something else, and we have a lot more confidence in saying that the treatment has no effect (and our hypothesis is falsified) if we fail to observe the expected result.
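Randomization is one concrete way of making treatment and control “as similar as possible.” Continuing the toy simulation from before (again, all numbers are invented): instead of fertilizing one contiguous half of a field with a fertility gradient, assign plots to treatment at random, so the gradient is spread roughly evenly across both groups:

```python
import random

random.seed(2)

TRUE_N_EFFECT = 20.0   # the real (hypothetical) fertilizer effect
BASE_YIELD = 150.0

def plot_yield(fertilized, soil_fertility):
    y = BASE_YIELD + soil_fertility + (TRUE_N_EFFECT if fertilized else 0.0)
    return y + random.gauss(0, 5)

# A fertility gradient runs across 100 plots (illustrative numbers):
gradient = [i * 0.3 for i in range(100)]

# Randomly assign half the plots to fertilizer, regardless of position:
plots = list(range(100))
random.shuffle(plots)
treated = set(plots[:50])

fert   = [plot_yield(True,  gradient[i]) for i in range(100) if i in treated]
unfert = [plot_yield(False, gradient[i]) for i in range(100) if i not in treated]

mean = lambda xs: sum(xs) / len(xs)
# With the gradient balanced by randomization, the difference in means
# is close to the true effect rather than being swamped by soil fertility:
print(f"estimated effect: {mean(fert) - mean(unfert):.1f}")
```

The design choice here is the whole point: randomization doesn’t remove the confounding variable, it distributes it evenly between the groups, so the remaining difference is attributable to the treatment.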
Next time we’ll talk about how to apply similar logic to observational studies and explore the challenge of making causal inferences from them.