We study the asymptotic frequentist coverage of credible sets obtained from a novel Bayesian approach to multiple linear regression under variable selection. We initially ignore the issue of variable selection, which allows us to put a conjugate normal prior on the coefficient vector. The variable selection step is then incorporated directly in the posterior through a sparsity-inducing map, and inference uses the induced prior instead of the natural conjugate posterior. The sparsity-inducing map minimizes the sum of the squared l2-distance weighted by the data matrix and a suitably scaled l1-penalty term. We obtain the limiting coverage of various credible regions and demonstrate that a modified credible interval for a component has the exact asymptotic frequentist coverage if the corresponding predictor is asymptotically uncorrelated with the other predictors. Through extensive simulations, we provide a guideline for choosing the penalty parameter as a function of the credibility level appropriate for the corresponding coverage. Finite-sample numerical results support the conclusions of the asymptotic theory. We also provide the credInt package, which implements the method in R and returns the credible intervals along with the posterior samples.
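To make the sparsity-inducing map concrete, the following is a minimal sketch in Python, not the credInt implementation. It assumes the map sends a coefficient draw theta to the minimizer over u of ||X theta - X u||^2 + lam * ||u||_1, where the exact scaling of the penalty is an assumption, and it solves this by coordinate descent. The names `soft_threshold`, `gram` and `sparse_projection` are hypothetical and do not come from the package.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def gram(X):
    """Return X'X for a design matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(row[j] * row[k] for row in X) for k in range(p)]
            for j in range(p)]


def sparse_projection(theta, X, lam, max_iter=500, tol=1e-12):
    """Minimize ||X theta - X u||^2 + lam * ||u||_1 over u.

    Uses cyclic coordinate descent; assumes no column of X is all zeros.
    The penalty scaling is an illustrative assumption.
    """
    G = gram(X)
    p = len(theta)
    # b = X'X theta, the linear term of the quadratic part.
    b = [sum(G[j][k] * theta[k] for k in range(p)) for j in range(p)]
    u = [0.0] * p
    for _ in range(max_iter):
        max_change = 0.0
        for j in range(p):
            # Linear term for coordinate j with the other coordinates held fixed.
            rho = b[j] - sum(G[j][k] * u[k] for k in range(p) if k != j)
            new_uj = soft_threshold(rho, lam / 2.0) / G[j][j]
            max_change = max(max_change, abs(new_uj - u[j]))
            u[j] = new_uj
        if max_change < tol:
            break
    return u


if __name__ == "__main__":
    # Orthogonal toy design (X'X = 4 I): the map reduces to soft-thresholding.
    X = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
    draw = [1.0, -0.1, 0.5]  # stands in for one draw from the conjugate posterior
    print(sparse_projection(draw, X, lam=2.0))  # prints [0.75, 0.0, 0.25]
```

In this sketch, applying `sparse_projection` to each draw from the conjugate normal posterior would give draws from the induced, sparse distribution. Small coordinates are set exactly to zero, which is how variable selection enters without a sparsity prior.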