{"id":583,"date":"2018-04-23T08:30:00","date_gmt":"2018-04-23T12:30:00","guid":{"rendered":"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/?p=583"},"modified":"2018-04-20T19:52:33","modified_gmt":"2018-04-20T23:52:33","slug":"causal-inference-in-ecology-controlled-experiments","status":"publish","type":"post","link":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2018\/04\/23\/causal-inference-in-ecology-controlled-experiments\/","title":{"rendered":"Causal inference in ecology &#8211; Controlled experiments"},"content":{"rendered":"<p><a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/causal-inference-in-ecology\/\">Causal inference in ecology &#8211; links to the series<\/a><\/p>\n<p>Randomized controlled experiments are generally regarded as the gold standard for identifying a causal factor.<sup><a id=\"ffn1\" class=\"footnote\" href=\"#fn1\">1<\/a><\/sup> Let\u2019s describe a really simple one first. Then we\u2019ll explore why they\u2019re regarded as the gold standard.<\/p>\n<p>Picking up with the example I used <a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2018\/04\/16\/causal-inference-in-ecology-counterfactuals\/\">last time<\/a>, let\u2019s suppose we\u2019re trying to test the hypothesis that applying nitrogen fertilizer increases the yield of corn.<sup><a id=\"ffn2\" class=\"footnote\" href=\"#fn2\">2<\/a><\/sup> As I pointed out, in setting up our experiment, we\u2019d seek to control for every variable that could influence corn yield so that we can isolate the effect of nitrogen. 
In the simplest possible case, we\u2019d have two adjacent plots in a field that have been plowed and tilled thoroughly so that the soil in the two plots is completely mixed and indistinguishable in every way &#8211; same content of nitrogen, phosphorus, other macronutrients, other micronutrients; same soil texture; same percent of (the same kind of) soil organic matter; same composition of clay, silt, and sand; everything.<sup><a id=\"ffn3\" class=\"footnote\" href=\"#fn3\">3<\/a><\/sup> We\u2019d also have plants that were genetically uniform (or as genetically uniform as we can make them), either highly inbred lines or an F1 cross produced between two highly inbred lines. We\u2019d make sure the field was level, maybe using high-tech laser leveling devices, and we\u2019d make sure that every plant in the entire field received the same amount of water. Since we know that the microclimate at the perimeter of the field differs from that in the middle of the field, we\u2019d make the field big enough that we could focus our measurements on a part of the field isolated from these edge effects. Then we\u2019d randomly choose one side of the field to be the \u201clow N\u201d treatment and the other to be the \u201chigh N\u201d treatment.<sup><a id=\"ffn4\" class=\"footnote\" href=\"#fn4\">4<\/a><\/sup> After allowing the plants to grow for an appropriate amount of time, we\u2019d harvest them, dry them, and weigh them.<\/p>\n<p>Our hypothesis has the form<\/p>\n<blockquote><p>If N is applied to a corn field, then the yield will be greater than if it had not been applied.<\/p><\/blockquote>\n<p>Notice that we can\u2019t both apply N and not apply N to the same set of plants. We have to compare what happens when we apply N to one set of plants and don\u2019t apply it to another. 
If we find that the \u201chigh N\u201d plants have a greater yield than the \u201clow N\u201d plants, we infer that the \u201clow N\u201d plants would also have had a greater yield <em><strong>if we had applied N to them<\/strong><\/em> (which we didn\u2019t). Why is that justified? Because everything about the two treatments is identical, by design, except for the amount of N applied. If there\u2019s a difference in yield, it can only be attributed to something that differs between the treatments, and the only thing that differs is the amount of N applied.<\/p>\n<p>I can hear you thinking, \u201cCouldn\u2019t the difference just be due to chance?\u201d Well, yes it could. If we do a statistical test and demonstrate that the yields are statistically distinguishable, that increases our confidence that the difference in yield is real, but nothing can ever make the conclusion logically certain in the way we can be logically certain that 2+2=4.<sup><a id=\"ffn5\" class=\"footnote\" href=\"#fn5\">5<\/a><\/sup> To my mind there are two things that make us accept the outcome of this experiment as evidence that applying N increases corn yield:<\/p>\n<ol>\n<li>It\u2019s not just this experiment. If the same experiment is repeated in different places with different soil types, different corn genotypes, and different weather patterns, we get the same result. We can never be certain, but the consistency of that result increases our confidence that the association isn\u2019t just a fluke.<\/li>\n<li>What we understand about plant growth and physiology leads us to expect that providing nitrogen in fertilizer should enhance plant growth. In other words, this particular hypothesis is part of a larger theoretical framework about plant physiology and development. 
That framework provides a coherent and repeatable set of predictions across a wide empirical domain.<\/li>\n<\/ol>\n<p>Put those two together, and we have good reason for thinking that the <em><strong>observed<\/strong><\/em> association between N fertilizer and corn yield is actually a <em><strong>causal<\/strong><\/em> association.<\/p>\n<p>In experiments where we can\u2019t completely control all relevant variables except the one that we\u2019re interested in, we rely on randomization. Suppose, for example, we couldn\u2019t produce genetically uniform corn. Then we\u2019d randomize the assignment of individuals to the \u201chigh\u201d and \u201clow\u201d treatments. The results aren\u2019t quite as solid as if we\u2019d had complete uniformity. It\u2019s always possible that by some statistical fluke a factor we aren\u2019t measuring ends up overrepresented in one treatment and underrepresented in the other, but if we\u2019ve randomized well <em><strong>and we have a reasonably large sample<\/strong><\/em>, the chances are small. So our inference isn\u2019t quite as firm, but it\u2019s still pretty good.<\/p>\n<p>We\u2019ll explore the \u201creasonably large sample\u201d question in the next installment.<\/p>\n<ol id=\"footnotes\">\n<li id=\"fn1\">See, for example, Rubin (<em>Annals of Applied Statistics<\/em> 2:808-840; 2008. <a href=\"https:\/\/projecteuclid.org\/euclid.aoas\/1223908042\">https:\/\/projecteuclid.org\/euclid.aoas\/1223908042<\/a>) <a href=\"#ffn1\">&#x21a9;<\/a><\/li>\n<li id=\"fn2\">If you know me or my work, you know that I\u2019m not at all crazy about the null hypothesis testing approach to investigating ecology. We\u2019ll get to that later, but let\u2019s start with a simple case. Even those of us who don\u2019t like null hypothesis testing as a general approach recognize that it has value. We\u2019ll focus on one way in which it has value here. 
<a href=\"#ffn2\">&#x21a9;<\/a><\/li>\n<li id=\"fn3\">If we were really fastidious we might even set up the experiment in a large growth chamber in which we mixed the soil together and distributed it evenly ourselves. <a href=\"#ffn3\">&#x21a9;<\/a><\/li>\n<li id=\"fn4\">If we were really paranoid about controlling for all possible factors, we\u2019d even randomly assign a nitrogen fertilizer level (high or low) to every different plant in the field, and we\u2019d probably do the whole experiment in a very large growth chamber where we could mix the soil ourselves and ensure that light, humidity, and temperature were as uniform as possible across all individuals in the experiment. <a href=\"#ffn4\">&#x21a9;<\/a><\/li>\n<li id=\"fn5\">If you don\u2019t see why, Google \u201cproblem of induction\u201d and you\u2019ll get some idea. If that doesn\u2019t satisfy you, ask, and I\u2019ll see what I can do to provide an explanation. <a href=\"#ffn5\">&#x21a9;<\/a><\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>Causal inference in ecology &#8211; links to the series Randomized controlled experiments are generally regarded as the gold standard for identifying a causal factor.1 Let\u2019s describe a really simple one&#8230; <a class=\"read-more-button\" href=\"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2018\/04\/23\/causal-inference-in-ecology-controlled-experiments\/\">Read more 
&gt;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-583","post","type-post","status-publish","format-standard","hentry","category-statistics"],"_links":{"self":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/583","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/comments?post=583"}],"version-history":[{"count":4,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/583\/revisions"}],"predecessor-version":[{"id":587,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/583\/revisions\/587"}],"wp:attachment":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/media?parent=583"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/categories?post=583"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/tags?post=583"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}