{"id":780,"date":"2019-08-19T08:00:00","date_gmt":"2019-08-19T12:00:00","guid":{"rendered":"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/?p=780"},"modified":"2019-08-17T12:07:02","modified_gmt":"2019-08-17T16:07:02","slug":"challenges-of-multiple-regression-or-why-we-might-want-to-select-variables","status":"publish","type":"post","link":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/08\/19\/challenges-of-multiple-regression-or-why-we-might-want-to-select-variables\/","title":{"rendered":"Challenges of multiple regression (or why we might want to select variables)"},"content":{"rendered":"\r\n<p><a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/variable-selection-in-multiple-regression\/\">Variable selection in multiple regression<\/a><\/p>\r\n\r\n\r\n\r\n<p>We saw in the <a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/08\/12\/collecting-my-thoughts-about-variable-selection-in-multiple-regression\/\">first installment in this series<\/a> that multiple regression may allow us to distinguish \u201creal\u201d from \u201cspurious\u201d associations among variables. Since it worked so effectively in the example we studied, you might wonder why you would ever want to reduce the number of covariates in a multiple regression.<\/p>\r\n\r\n\r\n\r\n<p>Why not simply throw in everything you\u2019ve measured and let the multiple regression sort things out for you? There are at least a couple of reasons:<\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li>When you have covariates that are highly correlated, the associations that are strongly supported may not be the ones that are \u201creal\u201d. In other words, if you\u2019re using multiple regression in an attempt to identify the \u201cimportant\u201d covariates, you may identify the wrong ones.<\/li>\r\n<li>When you have covariates that are highly correlated, any attempt to extrapolate predictions beyond the range of covariates that you\u2019ve measured may be misleading. 
This is especially true if you fit a linear regression and the true relationship is curvilinear.<sup><a id=\"ffn1\" class=\"footnote\" href=\"#fn1\">1<\/a><\/sup><\/li>\r\n<\/ol>\r\n\r\n\r\n\r\n<p>This <a href=\"http:\/\/darwin.eeb.uconn.edu\/pages\/variable-selection\/challenges-of-multiple-regression.nb.html\">R notebook<\/a> explores both of these points using the same set of deterministic relationships we\u2019ve used before to generate the data, but with a larger residual variance.<sup><a id=\"ffn2\" class=\"footnote\" href=\"#fn2\">2<\/a><\/sup><\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li id=\"fn1\">The R notebook linked here doesn\u2019t explore the problem of extrapolation when the true relationship is curvilinear, but if you\u2019ve been following along and you have a reasonable amount of facility with R, you shouldn\u2019t find it hard to explore that on your own. <a href=\"#ffn1\">&#x21a9;<\/a><\/li>\r\n<li id=\"fn2\">The R-squared in our initial example was greater than 0.99. That\u2019s why multiple regression worked so well. The example you\u2019ll see here has an R-squared of \u201conly\u201d 0.42 (adjusted 0.36). The \u201conly\u201d is in quotes because in many analyses in ecology and evolution, an R-squared that large would seem pretty good. <a href=\"#ffn2\">&#x21a9;<\/a><\/li>\r\n<\/ol>\r\n","protected":false},"excerpt":{"rendered":"<p>Variable selection in multiple regression We saw in the first installment in this series that multiple regression may allow us to distinguish \u201creal\u201d from \u201cspurious\u201d associations among variables. 
Since it&#8230; <a class=\"read-more-button\" href=\"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/08\/19\/challenges-of-multiple-regression-or-why-we-might-want-to-select-variables\/\">Read more &gt;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-780","post","type-post","status-publish","format-standard","hentry","category-statistics"],"_links":{"self":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/780","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/comments?post=780"}],"version-history":[{"count":4,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/780\/revisions"}],"predecessor-version":[{"id":783,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/780\/revisions\/783"}],"wp:attachment":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/media?parent=780"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/categories?post=780"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/tags?post=780"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}