The empirical Bayes $g$-modeling approach based on the nonparametric maximum likelihood estimator (NPMLE) has been central to large-scale estimation and inference in the normal means problem. However, theoretical guarantees for uncertainty quantification remain scarce. A key obstacle is that the NPMLE is necessarily discrete, which yields discrete posterior credible sets and a slow logarithmic deconvolution rate. We address both limitations by introducing a hierarchical Gaussian smoothing layer that restricts the mixing distribution to a Gaussian location mixture. Our smooth NPMLE inherits the favorable properties of the classical NPMLE: it is computable via convex optimization and achieves nearly parametric denoising performance. Moreover, it achieves a polynomial deconvolution rate that is asymptotically minimax over the corresponding class. Our procedure also leads to estimated smooth posteriors that converge to the true posteriors at a polynomial rate. Further, we characterize marginal coverage sets that are optimal in expected length, construct plug-in estimators of these sets, and establish theoretical guarantees for the estimated sets in terms of both coverage probability and expected length. We also extend the theory to settings with model misspecification and heteroscedastic Gaussian observations, and study identifiability of the proposed hierarchical model.
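To make the smoothing construction concrete, the following is a minimal sketch rather than the authors' implementation. It assumes unit-variance observations $X_i \mid \theta_i \sim N(\theta_i, 1)$ and a prior of the form $G = H * N(0, \tau^2)$, so that the marginal density of $X_i$ is a mixture of $N(u, 1 + \tau^2)$ densities over $u \sim H$. The grid approximation of $H$, the EM fixed-point updates (used here in place of a dedicated convex solver), and the names `smooth_npmle`, `posterior_mean`, `tau`, `grid_size`, and `n_iter` are all illustrative assumptions.

```python
import numpy as np


def smooth_npmle(x, tau, grid_size=100, n_iter=500):
    """Grid-based sketch of a smoothed NPMLE (hypothetical helper).

    Assumes X_i ~ N(theta_i, 1) and theta_i ~ H * N(0, tau^2), so the
    marginal of X_i is a mixture of N(u_j, 1 + tau^2) over grid points u_j.
    Returns the grid and the estimated weights of H on that grid.
    """
    x = np.asarray(x, dtype=float)
    grid = np.linspace(x.min(), x.max(), grid_size)
    s2 = 1.0 + tau ** 2  # marginal variance of each mixture component
    # lik[i, j] = density of N(u_j, 1 + tau^2) evaluated at x_i
    lik = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2 / s2)
    lik /= np.sqrt(2.0 * np.pi * s2)
    w = np.full(grid_size, 1.0 / grid_size)
    for _ in range(n_iter):
        marginal = lik @ w  # fitted marginal density at each x_i
        # EM update for mixture weights; it keeps w on the simplex
        w = w * (lik / marginal[:, None]).mean(axis=0)
    return grid, w


def posterior_mean(x, grid, w, tau):
    """Posterior mean of theta_i under the fitted smooth prior."""
    x = np.asarray(x, dtype=float)
    s2 = 1.0 + tau ** 2
    resp = np.exp(-0.5 * (x[:, None] - grid[None, :]) ** 2 / s2) * w
    resp /= resp.sum(axis=1, keepdims=True)  # P(component j | x_i)
    # Within component j, theta | x ~ N((u_j + tau^2 x) / (1 + tau^2), tau^2 / (1 + tau^2))
    comp_means = (grid[None, :] + tau ** 2 * x[:, None]) / s2
    return (resp * comp_means).sum(axis=1)


# Simulated example: two-point H smoothed by N(0, tau^2)
rng = np.random.default_rng(0)
tau = 0.5
centers = rng.choice([-2.0, 2.0], size=500)
theta = centers + tau * rng.normal(size=500)
x = theta + rng.normal(size=500)

grid, w = smooth_npmle(x, tau)
theta_hat = posterior_mean(x, grid, w, tau)
```

Because each mixture component is Gaussian, the fitted posterior of $\theta_i$ given $X_i$ is itself a Gaussian mixture with component variance $\tau^2 / (1 + \tau^2)$. It is therefore a smooth density even though the estimate of $H$ on the grid is discrete.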