Bayesian predictive inference provides a coherent description of the entire predictive uncertainty through predictive distributions. We examine several widely used sparsity priors from the viewpoint of predictive (as opposed to estimation) inference. To start, we investigate predictive distributions for a high-dimensional Gaussian observation with a known variance but an unknown sparse mean under the Kullback-Leibler loss. First, we show that LASSO (Laplace) priors are incapable of achieving rate-optimal predictive distributions. However, when the Laplace prior is deployed inside the Spike-and-Slab framework (e.g. with the Spike-and-Slab LASSO prior), rate-minimax performance can be attained with properly tuned parameters (depending on the sparsity level $s_n$). We highlight the discrepancy between prior calibration for the purpose of prediction and for the purpose of estimation. Going further, we investigate popular hierarchical priors which are known to attain adaptive rate-minimax performance for estimation. Whether or not they are also rate-minimax for predictive inference has, until now, been unclear. We answer affirmatively by showing that hierarchical Spike-and-Slab priors are adaptive and attain the minimax rate without knowledge of $s_n$. This is the first rate-adaptive result in the literature on predictive density estimation in sparse setups. Building on the sparse normal-means model, we extend our adaptive rate findings to the case of sparse high-dimensional regression with Spike-and-Slab priors. All of these results underscore the benefits of fully Bayesian predictive inference.
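
For orientation, the sparse normal-means prediction problem described above can be written out roughly as follows. This is only a sketch under assumed notation, none of which is fixed by the abstract: $X$ denotes the observed vector, $Y$ the future observation to be predicted, $v_x$ and $v_y$ their known variances, and $\Theta(s_n)$ the class of mean vectors with at most $s_n$ nonzero entries.

```latex
% Assumed notation; the abstract itself does not fix these symbols.
% Observed data X and future data Y share the unknown sparse mean \theta.
X \mid \theta \sim N_n(\theta, v_x I_n), \qquad
Y \mid \theta \sim N_n(\theta, v_y I_n), \qquad
\theta \in \Theta(s_n) = \{\theta \in \mathbb{R}^n : \|\theta\|_0 \le s_n\}.

% Kullback-Leibler risk of a predictive density \hat p(\cdot \mid x).
\rho(\theta, \hat p) = \mathbb{E}_{X \mid \theta} \int \pi(y \mid \theta)\,
  \log \frac{\pi(y \mid \theta)}{\hat p(y \mid X)} \, dy.

% Bayesian predictive density induced by a prior \pi(\theta).
\hat p(y \mid x) = \int \pi(y \mid \theta)\, \pi(\theta \mid x)\, d\theta.
```

In this notation, a predictive density is rate-minimax if its worst-case risk over $\Theta(s_n)$ matches the minimax risk up to a constant factor, and it is adaptive if this holds without using $s_n$ to construct the prior.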