Empirical Bayes provides a powerful approach to learning and adapting to latent structure in data. Theory and algorithms for empirical Bayes have a rich literature for sequence models, but are less well understood in settings where latent variables and data interact through more complex designs. In this work, we study empirical Bayes estimation of an i.i.d. prior in Bayesian linear models, via the nonparametric maximum likelihood estimator (NPMLE). We introduce and study a system of gradient flow equations for optimizing the marginal log-likelihood, jointly over the prior and posterior measures in its Gibbs variational representation, using a smoothed reparametrization of the regression coefficients. A diffusion-based implementation yields a Langevin dynamics MCEM algorithm, where the prior law evolves continuously over time to optimize a sequence-model log-likelihood defined by the coordinates of the current Langevin iterate. We show consistency of the NPMLE as $n, p \rightarrow \infty$ under mild conditions, including settings of random sub-Gaussian designs when $n \asymp p$. In high noise, we prove a uniform log-Sobolev inequality for the mixing of Langevin dynamics, for possibly misspecified priors and non-log-concave posteriors. We then establish polynomial-time convergence of the joint gradient flow to a near-NPMLE if the marginal negative log-likelihood is convex in a sub-level set of the initialization.
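For orientation, the objective described above can be written out as follows. This is a sketch in notation that is assumed here rather than taken from the abstract: $y \in \mathbb{R}^n$ for the response, $X \in \mathbb{R}^{n \times p}$ for the design, $\theta \in \mathbb{R}^p$ for the regression coefficients, $\sigma^2$ for a known Gaussian noise variance, $g$ for the prior on each coordinate, and $q$ for a candidate posterior law on $\mathbb{R}^p$. The Gaussian noise model is also an assumption of this sketch. The last identity is the standard Gibbs variational principle; it does not include the smoothed reparametrization used in the paper.

```latex
% Assumed model: linear regression with i.i.d. prior g on the coefficients
y = X\theta + \varepsilon, \qquad
\theta_1, \dots, \theta_p \overset{\text{i.i.d.}}{\sim} g, \qquad
\varepsilon \sim \mathcal{N}(0, \sigma^2 I_n).

% Marginal density of y under prior g, and the NPMLE
P_g(y) = \int (2\pi\sigma^2)^{-n/2}
  \exp\!\Big(-\frac{\|y - X\theta\|_2^2}{2\sigma^2}\Big)
  \prod_{j=1}^{p} g(d\theta_j),
\qquad
\hat{g} \in \arg\max_{g} \, \log P_g(y).

% Gibbs variational representation: the supremum over q is attained
% at the posterior law of \theta given y under prior g
\log P_g(y) = \sup_{q} \Big\{
  \mathbb{E}_{\theta \sim q}\big[\log p(y \mid \theta)\big]
  - \mathrm{KL}\big(q \,\big\|\, g^{\otimes p}\big)
\Big\},
\qquad
\log p(y \mid \theta) = -\frac{\|y - X\theta\|_2^2}{2\sigma^2}
  - \frac{n}{2}\log(2\pi\sigma^2).
```

Read this way, maximizing the marginal log-likelihood over $g$ is equivalent to maximizing the bracketed functional jointly over the pair $(q, g)$, which is the sense in which the optimization runs jointly over the prior and posterior measures.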