Bayesian variable selection methods are powerful techniques for fitting and inferring on sparse high-dimensional linear regression models. However, many are computationally intensive or require restrictive prior distributions on model parameters. In this paper, we propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression. By using plug-in empirical Bayes estimates of hyperparameters, the approach requires only minimal prior assumptions on the parameters. Efficient maximum a posteriori (MAP) estimation is carried out with a Parameter-Expanded Expectation-Conditional-Maximization (PX-ECM) algorithm. The PX-ECM yields a robust, computationally efficient coordinate-wise optimization that, when updating the coefficient for a particular predictor, adjusts for the impact of the other predictor variables. The E-step is completed using an approach motivated by the popular two-group approach to multiple testing. The result is a PaRtitiOned empirical Bayes Ecm (PROBE) algorithm applied to sparse high-dimensional linear regression, which can be completed using one-at-a-time or all-at-once type optimization. We compare the empirical properties of PROBE against comparable approaches in numerous simulation studies and in analyses of cancer cell drug responses. The proposed approach is implemented in the R package probe.
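To make the idea of a coordinate-wise update that adjusts for the other predictors concrete, the following is a minimal sketch in plain Python. It is not the PROBE algorithm: it has no empirical Bayes hyperparameter estimates, no E-step, and no parameter expansion. It only illustrates, for ordinary least squares, how each coefficient can be updated one at a time against a partial residual from which the contributions of all other predictors have been removed. The function name `coordinate_wise_fit` and its arguments are hypothetical and do not come from the probe package.

```python
def coordinate_wise_fit(X, y, n_sweeps=500, tol=1e-12):
    """Cyclic coordinate-wise least squares (illustrative sketch only).

    X is a list of n rows, each a list of p floats; y is a list of n floats.
    Returns a list of p fitted coefficients.
    """
    n, p = len(X), len(X[0])
    # Store predictors column-wise so each update touches one column.
    cols = [[X[i][j] for i in range(n)] for j in range(p)]
    col_sq = [sum(v * v for v in col) for col in cols]

    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros

    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero predictor carries no information
            col = cols[j]
            # Partial residual: the response with every predictor's current
            # contribution removed except that of predictor j.
            partial = [r + c * beta[j] for r, c in zip(residual, col)]
            # One-dimensional least squares of the partial residual on predictor j.
            new_bj = sum(c * r for c, r in zip(col, partial)) / col_sq[j]
            # Restore the full residual using the updated coefficient.
            residual = [r - c * new_bj for r, c in zip(partial, col)]
            max_change = max(max_change, abs(new_bj - beta[j]))
            beta[j] = new_bj
        if max_change < tol:
            break  # coefficients stopped moving
    return beta
```

In this sketch every update conditions on the current values of the remaining coefficients through the partial residual, which is the sense in which a one-at-a-time scheme accounts for the other predictors. A sparse Bayesian method would additionally weight or shrink each update, which is omitted here.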