Modern approaches to Bayesian variable selection rely largely on shrinkage priors. An ideal shrinkage prior should adapt to different signal levels, shrinking small effects toward zero while leaving the important ones relatively intact. With this task in mind, we develop the nonparametric Bayesian Lasso, an adaptive and flexible shrinkage prior for Bayesian regression and variable selection that is particularly useful when the number of predictors is comparable to or larger than the number of available data points. Building on spike-and-slab Lasso ideas, we extend them by placing a Dirichlet Process prior on the shrinkage parameters. The resulting prior on the regression coefficients can be seen as an infinite mixture of Double Exponential densities, each offering a different amount of regularization, which yields more adaptive and flexible shrinkage. We also develop an efficient Markov chain Monte Carlo algorithm for posterior inference. Through a range of simulation exercises and real-world data analyses, we demonstrate that the proposed method leads to better recovery of the true regression coefficients, better variable selection, and better out-of-sample predictions, highlighting the benefits of the nonparametric Bayesian Lasso over existing shrinkage priors.
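To make the hierarchy concrete, a minimal sketch of the model described above is the following (the notation $\beta_j$ for the regression coefficients, $\lambda_j$ for the per-coefficient shrinkage parameters, $\alpha$ for the concentration parameter, and $G_0$ for the base measure is introduced here for illustration and is not taken verbatim from the paper):
\begin{align*}
y \mid X, \beta, \sigma^2 &\sim \mathcal{N}\!\left(X\beta, \sigma^2 I_n\right), \\
\beta_j \mid \lambda_j &\sim \mathrm{DE}(\lambda_j), \qquad j = 1, \dots, p, \\
\lambda_j \mid G &\overset{\text{iid}}{\sim} G, \\
G &\sim \mathrm{DP}(\alpha, G_0),
\end{align*}
so that, marginalizing over $G$, each $\beta_j$ follows an infinite mixture of Double Exponential densities with differing scales, each inducing a different amount of regularization.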