Importance weighted variational inference (VI) approximates densities known only up to a normalizing constant by optimizing bounds that tighten as the number of Monte Carlo samples $N$ grows. Standard optimization relies on reparameterized gradient estimators, which are theoretically well studied yet restrict the choice of both the data-generating process and the variational approximation. REINFORCE gradient estimators are free of such restrictions but lack rigorous theoretical justification. In this paper, we provide the first comprehensive analysis of REINFORCE gradient estimators in importance weighted VI and leverage this theoretical foundation to diagnose and resolve fundamental deficiencies in current state-of-the-art estimators. Specifically, we introduce and examine a generalized family of variational inference for Monte Carlo objectives (VIMCO) gradient estimators. We prove that state-of-the-art VIMCO gradient estimators exhibit a signal-to-noise ratio (SNR) that vanishes as $N$ increases, which prevents effective optimization. To overcome this issue, we propose the novel VIMCO-$\star$ gradient estimator and show that it averts the SNR collapse of existing VIMCO estimators, instead achieving an SNR that scales as $\sqrt{N}$. We demonstrate its superior empirical performance over current VIMCO implementations in challenging settings where reparameterized gradients are typically unavailable.
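To make the objects in the abstract concrete, the following is a minimal NumPy sketch of the importance weighted bound $\mathcal{L}_N = \mathbb{E}\big[\log \frac{1}{N}\sum_i w_i\big]$ with $w_i = \tilde p(z_i)/q_\phi(z_i)$, together with the basic score-function (REINFORCE) gradient estimator for it. The toy unnormalized target $\tilde p(z) \propto e^{-z^2/2}$, the Gaussian variational family, and the function names are illustrative assumptions, not the paper's setup; in particular, this is the plain REINFORCE form without the leave-one-out control variates that distinguish VIMCO-style estimators.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy problem (assumption): unnormalized target p~(z) ∝ exp(-z^2/2),
# variational family q(z; mu, sigma) = Normal(mu, sigma^2).
def log_p_tilde(z):
    return -0.5 * z**2  # log of the unnormalized target density

def log_q(z, mu, sigma):
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def iw_bound(mu, sigma, N):
    """Single Monte Carlo estimate of the importance weighted bound
    L_N = E[log (1/N) sum_i w_i], with w_i = p~(z_i) / q(z_i)."""
    z = rng.normal(mu, sigma, size=N)
    log_w = log_p_tilde(z) - log_q(z, mu, sigma)
    # log-mean-exp of the importance weights, computed stably
    return np.logaddexp.reduce(log_w) - np.log(N)

def reinforce_grad_mu(mu, sigma, N):
    """Naive REINFORCE estimate of d L_N / d mu. Differentiating both the
    sampling density and the weights gives
        g = sum_i (L_hat - wbar_i) * d/dmu log q(z_i; mu, sigma),
    where wbar_i are the self-normalized importance weights."""
    z = rng.normal(mu, sigma, size=N)
    log_w = log_p_tilde(z) - log_q(z, mu, sigma)
    lme = np.logaddexp.reduce(log_w)
    L_hat = lme - np.log(N)            # bound estimate (learning signal)
    w_bar = np.exp(log_w - lme)        # normalized weights, sum to 1
    score = (z - mu) / sigma**2        # d/dmu log q(z; mu, sigma)
    return np.sum((L_hat - w_bar) * score)
```

With `mu=0, sigma=1`, every weight equals the target's normalizing constant $\sqrt{2\pi}$, so `iw_bound(0.0, 1.0, N)` returns $\tfrac{1}{2}\log 2\pi$ exactly for any $N$; away from that point, the learning signal `L_hat` multiplying every score term is what VIMCO-style estimators replace with per-sample baseline-corrected versions to control variance.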