Andrew Gelman has a long, interesting, and important post about designing exploratory studies. It was inspired by this comment from Ed Hagen, posted in response to a blog post about a paper in Psychological Science.
Exploratory studies need to become a “thing”. Right now, they play almost no formal role in social science, yet they are essential to good social science. That means we need to put as much effort in developing standards, procedures, and techniques for exploratory studies as we have for confirmatory studies. And we need academic norms that reward good exploratory studies so there is less incentive to disguise them as confirmatory.
I think Ed’s suggestion is too narrow. Exploratory studies are essential to good science, not just good social science.
We often (or at least I often) have only a vague idea about how features I’m interested in relate to one another. Take leaf mass per area (LMA)1 and mean annual temperature or mean annual precipitation, for example. In a worldwide dataset compiled by IJ Wright and colleagues2, tougher leaves (higher values of LMA) are associated with warmer temperatures and less rainfall.
We expected similar relationships in our analysis of Protea and Pelargonium,3 but we weren’t trying to test those expectations. We were trying to determine what those relationships were. We were, in other words, exploring our data, and guess what we found. Tougher leaves are associated with less rainfall in both genera and with warmer temperatures in Protea. They are, however, associated with cooler temperatures in Pelargonium, exactly the opposite of what we expected. One reason for the difference might be that Pelargonium leaves are drought deciduous, so they avoid the summer drought characteristic of the regions from which our samples were collected. That is, of course, a post hoc explanation and has to be interpreted cautiously as a hypothesis to be tested, not as an established causal explanation. But that is precisely the point. We needed to investigate the phenomena to identify a pattern. Only then could we generate a hypothesis worth testing.
I find that I am usually more interested in discovering what the phenomena are than in tying down the mechanistic explanations for them. The problem, as Ed Hagen suggests, is that studies that explicitly label themselves as exploratory play little role in science. They tend to be seen as “fishing expeditions,” not serious science. The key, as he also suggests, is that exploratory studies are useful only if they are done as carefully as explicit, hypothesis-testing confirmatory studies. A corollary he didn’t mention is that science will be well served if we observe the distinction between exploratory and confirmatory studies.4
1LMA is a widely used measure of how “tough” leaves are. It’s a key component of the worldwide leaf economics spectrum.
2Wright, IJ et al. 2004. The worldwide leaf economics spectrum. Nature 428:821-827 doi: 10.1038/nature02403
3Mitchell, N et al. 2015. Functional traits in parallel evolutionary radiations and trait-environment associations in the Cape Floristic Region of South Africa. American Naturalist 185:525-537 doi: 10.1086/680051
4Apologies to any social scientists who may be reading this, especially if you spend your days doing factor analysis. This will all be old hat to you. It may surprise you to learn that the distinction between exploratory and confirmatory analyses is almost unknown to ecologists and evolutionary biologists. At least it is almost unknown if I and the people I know are at all representative of the larger community.