Variable selection in multiple regression
The Lasso has been very widely used, particularly in high-dimensional problems where the number of observations is less than the number of covariates.1 In fact, when I checked Google Scholar on Saturday, it had been cited nearly 30,000 times.2 Bayesians didn’t want to be left out, so Trevor Park and George Casella developed the Bayesian Lasso.3 The Bayesian Lasso overcomes what to my mind is one of the great disadvantages of the original Lasso, the difficulty of providing an assessment of how reliable the regression coefficients are. Like any other Bayesian method that uses MCMC methods, it’s just as easy to get credible intervals on parameters as it is to get posterior means. The Bayesian Lasso also estimates \(\lambda\) as part of the procedure rather than relying on cross-validation. The R
package monovm
provides an implementation of the Bayesian Lasso in addition to other shrinkage regression methods.
I haven’t explored monovm
, but if you’re interested in the Bayesian Lasso, you might want to check it out. Instead of exploring the Bayesian Lasso, the R notebook I’ve put together here explores the use of “horseshoe priors” in rstanarm
. The basic idea is the same. We’d like to “shrink” some parameter estimates towards zero, and we’d like to have the data tell us which estimates to shrink. The nice thing about “horseshoe priors” in rstanarm
is that if you know how to set up a regression in stan_glm()
or stan_glmer()
you can use a horseshoe prior very easily in your analysis simply by changing the prior
parameter in your call to one of those functions.
- This is often referred to as an \(n \ll p\) problem. I’m not going to address that problem here, but if you deal with genomic data, you’ll want to familiarize yourself with the problem and the approaches typically used for addressing it. ↩
- 29,558 times to be exact. ↩
- Park, T. and G. Casella. 2008. The Bayesian Lasso. Jornal of the American Statistical Association. 103:681-686. doi: 10.1198/016214508000000337↩