Bayesian predictive inference provides a coherent description of the entire predictive uncertainty through predictive distributions. We examine several widely used sparsity priors from the viewpoint of predictive (as opposed to estimation) inference. Our context is estimating the predictive distribution of a high-dimensional Gaussian observation with a known variance but an unknown sparse mean under the Kullback-Leibler loss. First, we show that LASSO (Laplace) priors are incapable of achieving rate-optimal performance. This new result contributes to the literature on negative findings about Bayesian LASSO posteriors. However, when the Laplace prior is deployed inside the Spike-and-Slab framework (for example, as the Spike-and-Slab LASSO prior), rate-minimax performance can be attained with properly tuned parameters (depending on the sparsity level s_n). We highlight the discrepancy between prior calibration for the purpose of prediction and for estimation. Going further, we investigate popular hierarchical priors that are known to attain adaptive rate-minimax performance for estimation. Whether they are also rate-minimax for predictive inference has, until now, been unclear. We answer affirmatively by showing that hierarchical Spike-and-Slab priors are adaptive and attain the minimax rate without knowledge of s_n. This is the first rate-adaptive result in the literature on predictive density estimation in sparse setups. This finding underscores the benefits of fully Bayesian inference.
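
One common way to formalize the setting described above is sketched below. The notation is an assumption introduced for illustration and is not taken from the abstract: X denotes the observed data, Y the future observation, v_x and v_y their known variances, and \hat p a generic predictive density estimate. Only the sparsity level s_n is named in the text.

```latex
% Assumed notation: X (observed), Y (future), v_x, v_y (known variances),
% \hat p (predictive density estimate). Only s_n appears in the abstract.
\begin{align*}
  X \mid \theta &\sim N_n(\theta, v_x I_n), \qquad
  Y \mid \theta \sim N_n(\theta, v_y I_n), \qquad
  X \perp Y \mid \theta, \\
  \theta &\in \Theta_n(s_n)
    = \bigl\{\theta \in \mathbb{R}^n : \|\theta\|_0 \le s_n \bigr\}, \\
  % Kullback-Leibler loss of \hat p(\cdot \mid x) at the true mean \theta
  L\bigl(\theta, \hat p(\cdot \mid x)\bigr)
    &= \int p(y \mid \theta)\,
       \log \frac{p(y \mid \theta)}{\hat p(y \mid x)} \, dy, \\
  % Predictive risk: the loss averaged over the observed data
  \rho(\theta, \hat p)
    &= \int p(x \mid \theta)\, L\bigl(\theta, \hat p(\cdot \mid x)\bigr) \, dx .
\end{align*}
```

Under this formalization, a prior is rate-minimax when the worst-case risk of its Bayesian predictive density over \Theta_n(s_n) matches the minimax risk up to a constant factor, and adaptive when this holds without s_n entering the prior.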