Semi-implicit variational inference (SIVI) enhances the expressiveness of variational families through hierarchical semi-implicit distributions, but the intractability of their densities makes standard ELBO-based optimization biased. Recent score-matching approaches to SIVI (SIVI-SM) address this issue via a minimax formulation, at the expense of an additional lower-level optimization problem. In this paper, we propose kernel semi-implicit variational inference (KSIVI), a principled and tractable alternative that eliminates the lower-level optimization by leveraging kernel methods. We show that when optimizing over a reproducing kernel Hilbert space, the lower-level problem admits an explicit solution, reducing the objective to the kernel Stein discrepancy (KSD). Exploiting the hierarchical structure of semi-implicit distributions, the resulting KSD objective can be efficiently optimized using stochastic gradient methods. We establish optimization guarantees via variance bounds on Monte Carlo gradient estimators and derive statistical generalization bounds of order $\tilde{\mathcal{O}}(1/\sqrt{n})$. We further introduce a multi-layer hierarchical extension that improves expressiveness while preserving tractability. Empirical results on synthetic and real-world Bayesian inference tasks demonstrate the effectiveness of KSIVI.
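The kernel Stein discrepancy (KSD) objective mentioned above can be estimated purely from samples and the target's score function, without evaluating the variational density. The following is a minimal illustrative sketch of such a Monte Carlo (V-statistic) estimate, not the paper's implementation: the RBF kernel, the bandwidth `h`, and the function name `ksd_rbf` are assumptions made here for concreteness.

```python
import numpy as np

def ksd_rbf(samples, score_fn, h=1.0):
    """V-statistic estimate of the squared kernel Stein discrepancy
    with an RBF kernel k(x, y) = exp(-||x - y||^2 / (2 h^2)).

    `samples` : (n, d) array of draws from the variational distribution q.
    `score_fn`: returns the target score grad log p(x), applied row-wise.
    (Illustrative sketch; not the KSIVI implementation.)"""
    X = np.asarray(samples, dtype=float)
    n, d = X.shape
    S = score_fn(X)                            # (n, d) target scores at samples
    diff = X[:, None, :] - X[None, :, :]       # (n, n, d) pairwise x - y
    sq = np.sum(diff**2, axis=-1)              # (n, n) squared distances
    K = np.exp(-sq / (2 * h**2))               # RBF kernel matrix
    # Stein kernel u_p(x, y), assembled term by term:
    t1 = (S @ S.T) * K                                     # s(x)^T s(y) k
    t2 = np.einsum('id,ijd->ij', S, diff) * K / h**2       # s(x)^T grad_y k
    t3 = -np.einsum('jd,ijd->ij', S, diff) * K / h**2      # s(y)^T grad_x k
    t4 = (d / h**2 - sq / h**4) * K                        # tr(grad_x grad_y k)
    return (t1 + t2 + t3 + t4).mean()          # KSD^2 estimate

# Usage: samples matching a standard Gaussian target give a small KSD,
# while mismatched (shifted) samples give a larger one.
rng = np.random.default_rng(0)
good = rng.standard_normal((200, 2))           # draws from the target itself
bad = good + 2.0                               # shifted, mismatched draws
score = lambda x: -x                           # grad log p for N(0, I)
ksd_good = ksd_rbf(good, score)
ksd_bad = ksd_rbf(bad, score)
```

Because the Stein kernel is positive semi-definite, the V-statistic estimate is nonnegative, and it shrinks as the samples better match the target; in KSIVI this quantity is what stochastic gradient methods drive down.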