Variational inference (VI) has emerged as a popular method for approximate inference in high-dimensional Bayesian models. In this paper, we propose a novel VI method that extends the naive mean-field approximation via entropic regularization, referred to as $\Xi$-variational inference ($\Xi$-VI). $\Xi$-VI has a close connection to the entropic optimal transport problem and benefits from the computationally efficient Sinkhorn algorithm. We show that $\Xi$-variational posteriors effectively recover the true posterior dependency structure, with the dependence downweighted according to the regularization parameter. We analyze the role of the dimensionality of the parameter space in the accuracy of the $\Xi$-variational approximation and in its computational cost, providing a rough characterization of the statistical-computational trade-off in $\Xi$-VI. We also investigate the frequentist properties of $\Xi$-VI, establishing results on consistency, asymptotic normality, high-dimensional asymptotics, and algorithmic stability. We provide sufficient criteria for achieving polynomial-time approximate inference with the method. Finally, we demonstrate the practical advantage of $\Xi$-VI over mean-field variational inference on simulated and real data.
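As background only (this is the standard entropic optimal transport formulation, not the paper's specific objective): for probability measures $\mu$ and $\nu$, a cost function $c$, and regularization strength $\varepsilon > 0$, the entropically regularized transport problem referenced above reads
\[
\min_{\pi \in \Pi(\mu,\nu)} \int c(x,y)\, \mathrm{d}\pi(x,y) \;+\; \varepsilon\, \mathrm{KL}\big(\pi \,\big\|\, \mu \otimes \nu\big),
\]
where $\Pi(\mu,\nu)$ denotes the set of couplings with marginals $\mu$ and $\nu$; its discrete version is solved efficiently by Sinkhorn iterations. How $\Xi$-VI maps its variational objective onto this structure (which measures, cost, and regularization parameter play these roles) is specified in the main text.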