Semi-implicit variational inference (SIVI) enriches the expressiveness of variational families by utilizing a kernel and a mixing distribution to hierarchically define the variational distribution. Existing SIVI methods parameterize the mixing distribution using implicit distributions, leading to intractable variational densities. As a result, directly maximizing the evidence lower bound (ELBO) is not possible, so they resort to one of the following: optimizing bounds on the ELBO, employing costly inner-loop Markov chain Monte Carlo runs, or solving minimax objectives. In this paper, we propose a novel method for SIVI called Particle Variational Inference (PVI) which employs empirical measures to approximate the optimal mixing distribution, characterized as the minimizer of a free energy functional. PVI arises naturally as a particle approximation of a Euclidean--Wasserstein gradient flow and, unlike prior works, it directly optimizes the ELBO whilst making no parametric assumption about the mixing distribution. Our empirical results demonstrate that PVI performs favourably compared to other SIVI methods across various tasks. Moreover, we provide a theoretical analysis of the behaviour of the gradient flow of a related free energy functional: establishing the existence and uniqueness of solutions as well as propagation of chaos results.
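To make the particle idea concrete, the following is a minimal toy sketch (not the paper's algorithm): particles $z_k$ form an empirical mixing measure, the variational density is the Gaussian-kernel mixture $q(x) = \frac{1}{K}\sum_k \mathcal{N}(x; z_k, \sigma^2)$, and the particle locations are updated by gradient ascent on a Monte Carlo ELBO estimate, a crude time-discretization of a gradient flow in the particle positions. The target, bandwidth, particle count, and finite-difference gradient are all illustrative assumptions.

```python
import numpy as np

# Hypothetical toy sketch, NOT the paper's PVI algorithm: particles z_k define
# an empirical mixing measure, and q(x) = (1/K) sum_k N(x; z_k, sigma^2) is the
# semi-implicit variational density with a Gaussian kernel.

rng = np.random.default_rng(0)
sigma = 0.5          # kernel bandwidth (assumed fixed for illustration)
K, n_mc = 8, 64      # number of particles, Monte Carlo samples per particle

def log_p(x):
    # toy target: standard normal log density, up to an additive constant
    return -0.5 * x**2

def log_q(x, z):
    # log of the kernel-mixture density at points x (log-sum-exp for stability)
    d = (x[:, None] - z[None, :]) / sigma
    logk = -0.5 * d**2 - 0.5 * np.log(2 * np.pi * sigma**2)
    m = logk.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.exp(logk - m).mean(axis=1))

def elbo(z, eps):
    # Monte Carlo ELBO with reparameterized samples x = z_k + sigma * eps
    x = (z[:, None] + sigma * eps).ravel()
    return np.mean(log_p(x) - log_q(x, z))

z = rng.normal(3.0, 1.0, size=K)   # particles start far from the target
eps = rng.normal(size=(K, n_mc))   # common random numbers across iterations
lr, h = 0.1, 1e-5
e0 = elbo(z, eps)
for _ in range(200):
    grad = np.empty(K)
    for k in range(K):             # finite-difference ELBO gradient per particle
        zp, zm = z.copy(), z.copy()
        zp[k] += h
        zm[k] -= h
        grad[k] = (elbo(zp, eps) - elbo(zm, eps)) / (2 * h)
    z += lr * grad                 # gradient-ascent step on the ELBO
print(e0, elbo(z, eps))            # the ELBO estimate increases over the run
```

Note that, unlike parametric SIVI approaches, nothing here constrains the mixing measure beyond its support on $K$ atoms: the entropy-like $-\log q$ term in the ELBO gradient spreads the particles apart rather than collapsing them onto the target's mode.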