{"id":841,"date":"2019-09-30T20:00:00","date_gmt":"2019-10-01T00:00:00","guid":{"rendered":"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/?p=841"},"modified":"2019-09-28T15:31:04","modified_gmt":"2019-09-28T19:31:04","slug":"some-parting-thoughts-on-variable-selection-in-multiple-regression","status":"publish","type":"post","link":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/09\/30\/some-parting-thoughts-on-variable-selection-in-multiple-regression\/","title":{"rendered":"Some parting thoughts on variable selection in multiple regression"},"content":{"rendered":"\r\n<p><a href=\"#\">Variable selection in multiple regression<\/a><\/p>\r\n\r\n\r\n\r\n<p>As I said <a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/08\/12\/collecting-my-thoughts-about-variable-selection-in-multiple-regression\/\">a month and a half ago<\/a>, this series started because<\/p>\r\n\r\n\r\n\r\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\r\n<p>I was talking with one of my graduate students a few days ago about variable selection in multiple regression. She was looking for a published \u201ccheat sheet.\u201d I told her I didn\u2019t know of any. \u201cWhy don\u2019t you write one?\u201d \u201cThe world\u2019s too complicated for that. There will always be judgment involved. There will never be a simple recipe to follow.\u201d<\/p>\r\n<\/blockquote>\r\n\r\n\r\n\r\n<p>If you\u2019ve been following along, it won\u2019t surprise you to learn that I\u2019m not going to conclude with a simple recipe.<\/p>\r\n\r\n\r\n\r\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\r\n<p>\u201cThe world\u2019s too complicated for that. There will always be judgment involved. There will never be a simple recipe to follow.\u201d<\/p>\r\n<\/blockquote>\r\n\r\n\r\n\r\n<p>Although there will never be a simple recipe, I can tell you what I\u2019m going to do. 
You\u2019ll want to look at <a href=\"http:\/\/darwin.eeb.uconn.edu\/pages\/variable-selection\/smaller-association.nb.html\">a new R notebook<\/a> that explores what happens when associations among covariates aren\u2019t as strong as those we\u2019ve been assuming so far.<\/p>\r\n\r\n\r\n\r\n<ul class=\"wp-block-list\">\r\n<li>For any analysis where I can use <code>stan_glm()<\/code> or <code>stan_glmer()<\/code>, I\u2019ll use <a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/09\/16\/a-bayesian-approach-to-variable-selection-using-horseshoe-priors\/\">horseshoe priors<\/a> to \u201cshrink\u201d the regression coefficients for unimportant covariates towards zero. For any analysis where I can\u2019t use <code>stan_glm()<\/code> or <code>stan_glmer()<\/code>, I\u2019ll probably be using <a href=\"https:\/\/mc-stan.org\/\"><code>Stan<\/code><\/a> directly, and I\u2019ll hardcode the horseshoe priors myself.<\/li>\r\n<li>If I feel the need to use a relatively objective method to identify some subset of covariates that are \u201cimportant\u201d,<sup><a id=\"ffn1\" class=\"footnote\" href=\"#fn1\">1<\/a><\/sup> I\u2019ll use <a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/09\/23\/using-projection-predictiion-for-variable-selection-in-a-bayesian-regression\/\">projection predictive variable selection<\/a> as implemented in <code>projpred<\/code>.<sup><a id=\"ffn2\" class=\"footnote\" href=\"#fn2\">2<\/a><\/sup><\/li>\r\n<li>For reasons outlined in the <a href=\"http:\/\/darwin.eeb.uconn.edu\/pages\/variable-selection\/smaller-association.nb.html#conclusions\">Conclusions<\/a> section of the R notebook I mentioned above, I will be <em><strong>very<\/strong><\/em> cautious about interpreting associations between covariates and response variables as anything other than a statistical association. 
Only if an association I find has been found repeatedly in other data sets and also has a good \u201cfirst principles\u201d explanation will I begin to interpret it as a causal association. Otherwise, I\u2019ll interpret it as an intriguing pattern worthy of further study and exploration. If you want more details on how hard it is to infer causal relationships from these kinds of analyses, look at my blog series on <a href=\"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/causal-inference-in-ecology\/\">causal inference in ecology<\/a>.<\/li>\r\n<\/ul>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li id=\"fn1\">I can think of a couple of reasons that I might want to select a subset of covariates. (1) I might not have much data with which to fit my model. Because of the priors, I\u2019ll be able to fit it without the model blowing up, but the parameter estimates are likely to be very poorly defined. Reducing the number of parameters may help me isolate an \u201cinteresting\u201d relationship. So long as I remember that all I may uncover is a statistical association, that pattern might still be worth investigating. (2) I might want a relatively objective way to simplify a complicated model so that it\u2019s easier to understand, and there may not be an obvious break graphically or numerically between the covariates that are important and those that aren\u2019t. Regardless of whether it\u2019s for reason #1 or reason #2, I will be <em><strong>extremely<\/strong><\/em> cautious about interpreting any associations identified through projection predictive variable selection as \u201creal\u201d and the ones not identified as \u201cspurious.\u201d In fact, I probably won\u2019t do it at all, and I\u2019ll probably present results from analysis of the full model in addition to the reduced model, even if the full model results only go in online supplemental material. 
<a href=\"#ffn1\">&#x21a9;<\/a><\/li>\r\n<li id=\"fn2\">Since I\u2019m new to using <code>projpred<\/code>, I don\u2019t know whether I\u2019ll be able to use it with my own Stan code. If not, it\u2019s yet another reason for me to learn <a href=\"https:\/\/cran.r-project.org\/web\/packages\/brms\/index.html\"><code>brms<\/code><\/a>, which can handle a bunch of models that <code>rstanarm<\/code> can\u2019t &#8211; possibly many of them that I\u2019d be hardcoding in Stan otherwise. <a href=\"#ffn2\">&#x21a9;<\/a><\/li>\r\n<\/ol>\r\n","protected":false},"excerpt":{"rendered":"<p>Variable selection in multiple regression As I said a month and a half ago, this series started because I was talking with one of my graduate students a few days&#8230; <a class=\"read-more-button\" href=\"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/09\/30\/some-parting-thoughts-on-variable-selection-in-multiple-regression\/\">Read more &gt;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-841","post","type-post","status-publish","format-standard","hentry","category-statistics"],"_links":{"self":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/841","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/comments?post=841"}],"version-history":[{"count":3,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/841\/revisions"}],"predecessor-version":[{"id
":845,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/841\/revisions\/845"}],"wp:attachment":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/media?parent=841"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/categories?post=841"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/tags?post=841"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}