In this paper, we consider parametric empirical Bayes estimation of an i.i.d. prior in high-dimensional Bayesian linear regression with random design. We obtain the asymptotic distribution of the variational empirical Bayes (vEB) estimator, which approximately maximizes a variational lower bound on the intractable marginal likelihood. We characterize a sharp phase transition for the vEB estimator: it is information-theoretically optimal (in terms of limiting variance) up to $p=o(n^{2/3})$, whereas it suffers from a sub-optimal convergence rate in higher dimensions. In the first regime, i.e., when $p=o(n^{2/3})$, we show how the estimated prior can be calibrated to enable valid coordinate-wise and delocalized inference, both under the \emph{empirical Bayes posterior} and the oracle posterior. In the second regime, we propose a debiasing technique to improve the performance of the vEB estimator beyond $p=o(n^{2/3})$. Extensive numerical experiments corroborate our theoretical findings.