This article considers Bayesian model selection via mean-field (MF) variational approximation. To this end, we study the non-asymptotic properties of MF inference in a Bayesian framework that allows latent variables and model misspecification. Concretely, we establish a Bernstein-von Mises (BvM) theorem for the MF variational distribution under possible model misspecification, which implies that the MF variational approximation converges in distribution to a normal distribution centered at the maximum likelihood estimator (within the specified model). Motivated by the BvM theorem, we propose a model selection criterion based on the evidence lower bound (ELBO) and demonstrate that the model selected by ELBO tends to agree with the one selected by the commonly used Bayesian information criterion (BIC) as the sample size tends to infinity. Compared to BIC, ELBO tends to incur a smaller approximation error to the log-marginal likelihood (a.k.a. model evidence), owing to better dimension dependence and full incorporation of the prior information. Moreover, we show the geometric convergence of the coordinate ascent variational inference (CAVI) algorithm under the parametric model framework, which provides practical guidance on how many iterations one typically needs to run when approximating the ELBO. These findings demonstrate that variational inference can provide a computationally efficient alternative to conventional approaches in tasks beyond obtaining point estimates, as our extensive numerical experiments also confirm empirically.
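To make the geometric-convergence claim for CAVI concrete, the following is a minimal sketch on a textbook conjugate model (Gaussian data with unknown mean and precision under a Normal-Gamma prior), not the setting or the proof of this article. The function name `cavi_normal_gamma`, the hyperparameters `mu0`, `lam0`, `a0`, `b0`, and the simulated data are all assumptions chosen for illustration. The mean-field family is q(mu) q(tau), and the script records E_q[tau] after each coordinate sweep.

```python
import numpy as np

def cavi_normal_gamma(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, n_iter=10):
    """CAVI for x_i ~ N(mu, 1/tau), mu | tau ~ N(mu0, 1/(lam0*tau)), tau ~ Gamma(a0, b0).

    Mean-field family (an illustrative choice): q(mu, tau) = q(mu) q(tau),
    with q(mu) = N(mu_n, 1/lam_n) and q(tau) = Gamma(a_n, b_n) (rate b_n).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # These two quantities are fixed across iterations in this model.
    mu_n = (lam0 * mu0 + x.sum()) / (lam0 + n)
    a_n = a0 + 0.5 * (n + 1)
    e_tau = a0 / b0  # initial guess for E_q[tau]: the prior mean
    history = [e_tau]
    for _ in range(n_iter):
        # Update q(mu) given the current E_q[tau].
        lam_n = (lam0 + n) * e_tau
        # Update q(tau) given the current q(mu); 1/lam_n is Var_q(mu).
        data_sq = np.sum((x - mu_n) ** 2) + n / lam_n
        prior_sq = lam0 * ((mu_n - mu0) ** 2 + 1.0 / lam_n)
        b_n = b0 + 0.5 * (data_sq + prior_sq)
        e_tau = a_n / b_n
        history.append(e_tau)
    lam_n = (lam0 + n) * e_tau  # keep q(mu) consistent with the final q(tau)
    return mu_n, lam_n, a_n, b_n, np.array(history)

# Simulated data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=20)

mu_n, lam_n, a_n, b_n, history = cavi_normal_gamma(x)
gaps = np.abs(np.diff(history))  # change in E_q[tau] between sweeps
print("E_q[tau] per sweep:", history)
print("successive gaps:   ", gaps)
```

In this toy model the update for E_q[tau] reduces to a one-dimensional fixed-point iteration whose derivative at the fixed point is 1/(2 a_n), so the gaps shrink by a roughly constant factor per sweep until they reach floating-point precision. That is the geometric pattern, and it suggests why a small number of sweeps can suffice here; models with latent variables may behave differently.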