Variational inference, such as the mean-field (MF) approximation, requires certain conjugacy structures for efficient computation. These requirements can impose unnecessary restrictions on the viable family of prior distributions and further constrain the variational approximation family. In this work, we introduce a general computational framework for implementing MF variational inference for Bayesian models, with or without latent variables, via the Wasserstein gradient flow (WGF), a modern mathematical technique for realizing a gradient flow over the space of probability measures. Theoretically, we analyze the algorithmic convergence of the proposed approaches and provide an explicit expression for the contraction factor. We also strengthen existing results on MF variational posterior concentration from polynomial to exponential contraction by utilizing the fixed-point equation of the time-discretized WGF. Computationally, we propose a new constraint-free function approximation method using neural networks to numerically realize our algorithm. This method is shown to be more precise and efficient than traditional particle approximation methods based on Langevin dynamics.
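For orientation, the traditional particle approximation based on Langevin dynamics that the abstract uses as a point of comparison can be sketched as follows. This is only an illustrative sketch and not the neural-network method proposed in the work: it assumes a Gaussian target with known precision matrix, for which the expected score under the other factors has a closed form, and the function name `mean_field_langevin` and all of its parameters are hypothetical.

```python
import numpy as np


def mean_field_langevin(precision, mean, n_particles=2000, n_steps=2000,
                        step=0.01, seed=0):
    """Particle (Langevin) approximation of a mean-field fit to N(mean, precision^-1).

    Column j of the returned array holds particles representing the factor q_j.
    Assumption: the target is Gaussian, so the score is linear in the other
    coordinates and its expectation under q_{-j} only needs their particle means.
    """
    rng = np.random.default_rng(seed)
    d = mean.shape[0]
    diag = np.diag(precision)            # Lambda_jj
    off = precision - np.diag(diag)      # off-diagonal part of Lambda (symmetric)
    x = rng.standard_normal((n_particles, d))
    for _ in range(n_steps):
        m = x.mean(axis=0)               # current particle means of each factor
        # E_{q_{-j}}[d/dx_j log pi] = -Lambda_jj (x_j - mu_j) - sum_{l != j} Lambda_jl (m_l - mu_l)
        drift = -(x - mean) * diag - (m - mean) @ off
        noise = rng.standard_normal((n_particles, d))
        # Euler-Maruyama step of the coordinate-wise Langevin dynamics
        x = x + step * drift + np.sqrt(2.0 * step) * noise
    return x


if __name__ == "__main__":
    mu = np.array([1.0, -2.0])
    cov = np.array([[1.0, 0.5], [0.5, 1.0]])
    particles = mean_field_langevin(np.linalg.inv(cov), mu)
    print("particle means:    ", particles.mean(axis=0))
    print("particle variances:", particles.var(axis=0))
```

For a Gaussian target, the optimal MF factors are Gaussians centered at the target mean with variances 1/Λ_jj, where Λ is the precision matrix, so the particle variances in this example should settle near 0.75 rather than the marginal variance 1. The Euler-Maruyama discretization and the finite number of particles both introduce small errors.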