Semi-implicit variational inference (SIVI) enriches the expressiveness of variational families by using a kernel and a mixing distribution to define the variational distribution hierarchically. Existing SIVI methods parameterize the mixing distribution using implicit distributions, leading to intractable variational densities. As a result, directly maximizing the evidence lower bound (ELBO) is not possible, and these methods resort to one of the following: optimizing bounds on the ELBO, running costly inner-loop Markov chain Monte Carlo, or solving minimax objectives. In this paper, we propose a novel method for SIVI called Particle Variational Inference (PVI), which uses empirical measures to approximate the optimal mixing distribution, characterized as the minimizer of a natural free energy functional, via a particle approximation of a Euclidean--Wasserstein gradient flow. As a result, unlike prior works, PVI can directly optimize the ELBO; furthermore, it makes no parametric assumption about the mixing distribution. Our empirical results demonstrate that PVI performs favourably against other SIVI methods across various tasks. Moreover, we provide a theoretical analysis of the behaviour of the gradient flow of a related free energy functional: establishing the existence and uniqueness of solutions as well as propagation of chaos results.
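To make the hierarchical construction concrete, the standard semi-implicit family and the free-energy objective it induces can be sketched as follows (the symbols $k$, $\mu$, and $\pi$ are illustrative notation for the kernel, mixing distribution, and target posterior, not the paper's exact notation):

```latex
% Semi-implicit variational family: a kernel k mixed over mu
q_{\mu}(x) \;=\; \int k(x \mid z)\, \mu(\mathrm{d}z).

% Maximizing the ELBO over the mixing distribution mu is equivalent
% (up to an additive constant) to minimizing the free energy
\mathcal{F}(\mu)
\;=\; \mathrm{KL}\!\left(q_{\mu} \,\|\, \pi\right)
\;=\; \int q_{\mu}(x)\,\log \frac{q_{\mu}(x)}{\pi(x)}\,\mathrm{d}x
\;=\; -\,\mathrm{ELBO}(q_{\mu}) + \mathrm{const}.
```

Under this view, approximating the minimizing $\mu$ by an empirical measure over particles, and evolving those particles along a (Euclidean--Wasserstein) gradient flow of $\mathcal{F}$, yields a direct, parameter-free optimization of the ELBO.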