Bayesian predictive inference provides a coherent description of the entire predictive uncertainty through predictive distributions. We examine several widely used sparsity priors from the viewpoint of predictive (as opposed to estimation) inference. Our context is estimating the predictive distribution of a high-dimensional Gaussian observation with known variance but an unknown sparse mean under the Kullback-Leibler loss. First, we show that LASSO (Laplace) priors are incapable of achieving rate-optimal performance. This new result contributes to the literature on negative findings about Bayesian LASSO posteriors. However, when the Laplace prior is deployed inside the Spike-and-Slab framework (for example, with the Spike-and-Slab LASSO prior), rate-minimax performance can be attained with properly tuned parameters (depending on the sparsity level s_n). We highlight the discrepancy between prior calibration for prediction and prior calibration for estimation. Going further, we investigate popular hierarchical priors that are known to attain adaptive rate-minimax performance for estimation. Whether they are also rate-minimax for predictive inference has, until now, been unclear. We answer affirmatively by showing that hierarchical Spike-and-Slab priors are adaptive and attain the minimax rate without knowledge of s_n. This is the first rate-adaptive result in the literature on predictive density estimation in sparse setups. This finding highlights the benefits of fully Bayesian inference.
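For concreteness, the prediction problem described in the abstract can be written out as follows. This is a sketch of one standard formalization and should be read as an assumption: the symbols X, Y, \theta, v_x, v_y, \pi, and \hat p do not appear in the abstract, and the paper's own notation may differ.

```latex
% Assumed notation (not taken from the abstract):
% X is the observed vector, Y the future vector to be predicted,
% independent given \theta, with known variances v_x and v_y.
X \mid \theta \sim N_n(\theta, v_x I_n), \qquad
Y \mid \theta \sim N_n(\theta, v_y I_n), \qquad
\|\theta\|_0 \le s_n .

% Kullback-Leibler loss of a predictive density \hat p(\cdot \mid x)
% relative to the true density p(\cdot \mid \theta) of Y:
L\bigl(\theta, \hat p(\cdot \mid x)\bigr)
  = \int p(y \mid \theta) \,
    \log \frac{p(y \mid \theta)}{\hat p(y \mid x)} \, dy .

% Predictive risk: the loss averaged over the observed data X.
\rho(\theta, \hat p)
  = \int p(x \mid \theta) \, L\bigl(\theta, \hat p(\cdot \mid x)\bigr) \, dx .

% Bayesian predictive density under a prior \pi with posterior \pi(\theta \mid x):
\hat p_\pi(y \mid x)
  = \int p(y \mid \theta) \, \pi(\theta \mid x) \, d\theta .
```

Under this reading, a prior \pi is rate-minimax when the worst-case risk of \hat p_\pi over s_n-sparse means matches the minimax risk up to a constant factor, and it is adaptive when this holds without s_n being used to tune the prior.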