{"id":816,"date":"2019-09-09T08:00:00","date_gmt":"2019-09-09T12:00:00","guid":{"rendered":"http:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/?p=816"},"modified":"2019-09-07T08:34:02","modified_gmt":"2019-09-07T12:34:02","slug":"using-the-lasso-for-variable-selection","status":"publish","type":"post","link":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/09\/09\/using-the-lasso-for-variable-selection\/","title":{"rendered":"Using the Lasso for variable selection"},"content":{"rendered":"\r\n<p><a href=\"#\">Variable selection in multiple regression<\/a><\/p>\r\n\r\n\r\n\r\n<p>If you\u2019ve been following along, you\u2019ve now seen some fairly simple approaches for reducing the number of covariates in a linear regression. It shouldn\u2019t come as a shock that statisticians have been worried about the problem for a long time or that they\u2019ve come up with some pretty sophisticated approaches to the problem.<sup><a id=\"ffn1\" class=\"footnote\" href=\"#fn1\">1<\/a><\/sup> The first one we\u2019ll explore is the Lasso (<strong>l<\/strong>east <strong>a<\/strong>solute <strong>s<\/strong>hrinkage and <strong>s<\/strong>election <strong>o<\/strong>perator), which Rob Tibshirani introduced the Lasso to statistics and machine learning more than 20 years ago.<sup><a id=\"ffn2\" class=\"footnote\" href=\"#fn2\">2<\/a><\/sup> You\u2019ll find more details in the R notebook illustrating <a href=\"http:\/\/darwin.eeb.uconn.edu\/pages\/variable-selection\/using-the-lasso.nb.html\">using the Lasso to select covariates<\/a>, but here are the basic ideas.<\/p>\r\n\r\n\r\n\r\n<p>The \u201cshrinkage\u201d part of the name refers to the idea that we don\u2019t expect all of the covariates we\u2019re including in the model to be important. And if a covariate isn\u2019t important, we want the magnitude of the regression coefficient associated with that component to be zero (or nearly zero). In other words, we want the estimate to be \u201cshrunk\u201d towards zero rather than taking the value it would if we included it in the full multiple regression.<\/p>\r\n\r\n\r\n\r\n<p>The \u201cselection\u201d part of the name refers to the idea that we don\u2019t know ahead of time which of the covariates are important (and shouldn\u2019t be shrunk towards 0) and which are important (and should be shrunk towards 0). We want the data to tell us which covariates are important and which aren\u2019t, i.e., we want the data to \u201cselect\u201d important covariates.<\/p>\r\n\r\n\r\n\r\n<p>The Lasso accomplishes this by adding a penalty to the typical least squares estimates. Instead of simply minimizing the sum of squared deviations from the regression line, we do so subject to a constraint that the total magnitude of all regression coefficients is less than some value. We\u2019ll use <code>glmnet()<\/code> to fit the Lasso. If you explore the accompanying documentation, you\u2019ll see that the Lasso is just one method along a continuum of constrained optimization approaches. I\u2019ll let you explore those on your own if you\u2019re interested.<\/p>\r\n\r\n\r\n\r\n<ol class=\"wp-block-list\">\r\n<li id=\"fn1\">I\u2019m not going to discuss forward, backward, or all subsets approaches to selecting variables. They don\u2019t seem to be used much anymore (for good reason). If you\u2019re interested in them, take a look at the Wikipedia page on <a href=\"https:\/\/en.wikipedia.org\/wiki\/Stepwise_regression\">stepwise regression<\/a>. 
<a href=\"#ffn1\">&#x21a9;<\/a><\/li>\r\n<li id=\"fn2\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Lasso_(statistics)\">Wikipedia<\/a> points out that it was originally introduced 10 years earlier in geophysics, but Tibshirani discovered it independently, and it was his discovery that led to its wide use in statistics and machine learning. <a href=\"#ffn2\">&#x21a9;<\/a><\/li>\r\n<\/ol>\r\n","protected":false},"excerpt":{"rendered":"<p>Variable selection in multiple regression If you\u2019ve been following along, you\u2019ve now seen some fairly simple approaches for reducing the number of covariates in a linear regression. It shouldn\u2019t come&#8230; <a class=\"read-more-button\" href=\"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/blog\/2019\/09\/09\/using-the-lasso-for-variable-selection\/\">Read more &gt;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-816","post","type-post","status-publish","format-standard","hentry","category-statistics"],"_links":{"self":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/816","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/comments?post=816"}],"version-history":[{"count":2,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/816\/revisions"}],"predecessor-version":[{"id":818,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/posts\/816\/revisions\/818"}],"wp:attachment":[{"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/media?parent=816"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/categories?post=816"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/darwin.eeb.uconn.edu\/uncommon-ground\/wp-json\/wp\/v2\/tags?post=816"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}