Bayesian High-dimensional Linear Regression with Sparse Projection-posterior

We consider a novel Bayesian approach to estimation, uncertainty quantification, and variable selection for a high-dimensional linear regression model under sparsity. The number of predictors can be nearly exponentially large relative to the sample size. We place a conjugate normal prior on the regression coefficients that initially disregards sparsity. For inference, however, instead of the original multivariate normal posterior, we use the posterior distribution induced by a map that transforms the vector of regression coefficients into a sparse vector, obtained by minimizing the sum of squares of deviations plus a suitably scaled $\ell_1$-penalty on the vector. We show that the resulting sparse projection-posterior distribution contracts around the true value of the parameter at the optimal rate, adapted to the sparsity of the vector. We show that the true sparsity structure receives a large sparse projection-posterior probability. We further show that an appropriately recentred credible ball has the correct asymptotic frequentist coverage. Finally, we describe how the computational burden can be distributed across many machines, each dealing with only a small fraction of the whole dataset. We conduct a comprehensive simulation study under a variety of settings and find that the proposed method performs well for finite sample sizes. We also apply the method to several real datasets, including the ADNI data, and compare its performance with state-of-the-art methods. The method is implemented in the \texttt{R} package \texttt{sparseProj}, and all computations have been carried out using this package.
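The two-step construction described above (draw from the unrestricted normal posterior, then map each draw to a sparse vector) can be illustrated with a minimal Python sketch. It is not the \texttt{sparseProj} implementation, and it rests on several assumptions that the abstract does not state: the deviations are measured as $\|X\theta - Xu\|^2$, the prior is $\theta \sim N(0, \sigma^2 a^{-1} I)$ with known $\sigma$, and the $\ell_1$-penalized problem is solved by plain coordinate descent. The names `sparse_projection`, `sparse_projection_posterior`, `lam`, and `a` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def sparse_projection(X, theta, lam, n_iter=200):
    """Approximate argmin_u ||X theta - X u||^2 + lam * ||u||_1.

    Assumed form of the map; solved by cyclic coordinate descent.
    """
    target = X @ theta
    col_sq = (X ** 2).sum(axis=0)      # squared column norms
    u = np.zeros(X.shape[1])
    resid = target.copy()              # invariant: resid = target - X @ u
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            # Inner product of column j with the partial residual
            # (the residual with coordinate j's contribution added back).
            rho = X[:, j] @ resid + col_sq[j] * u[j]
            new = soft_threshold(rho, lam / 2.0) / col_sq[j]
            resid += X[:, j] * (u[j] - new)
            u[j] = new
    return u


def sparse_projection_posterior(X, y, lam, a=1.0, sigma=1.0,
                                n_draws=100, seed=0):
    """Draws from the induced posterior under an assumed N(0, sigma^2/a I) prior.

    The unrestricted posterior is N(A^{-1} X'y, sigma^2 A^{-1}) with
    A = X'X + a I; each draw is pushed through sparse_projection.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    A = X.T @ X + a * np.eye(p)
    mean = np.linalg.solve(A, X.T @ y)
    L = np.linalg.cholesky(A)          # A = L @ L.T
    draws = np.empty((n_draws, p))
    for i in range(n_draws):
        z = rng.standard_normal(p)
        # solve(L.T, z) has covariance A^{-1}
        theta = mean + sigma * np.linalg.solve(L.T, z)
        draws[i] = sparse_projection(X, theta, lam)
    return draws
```

Given the returned array `draws`, the proportion of rows in which coordinate $j$ is nonzero, `(draws[:, j] != 0).mean()`, gives a simple Monte Carlo estimate of the induced posterior probability that predictor $j$ is selected. Because the draws are independent and each projection is a separate optimization, the loop is straightforward to parallelize.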
