Importance sampling is a popular technique in Bayesian inference: by reweighting samples drawn from a proposal distribution, we can obtain samples and moment estimates from a Bayesian posterior over latent variables. Recent work, however, indicates that importance sampling scales poorly -- to accurately approximate the true posterior, the required number of importance samples grows exponentially in the number of latent variables [Chatterjee and Diaconis, 2018]. Massively parallel importance sampling works around this issue by drawing $K$ samples for each of the $n$ latent variables and reasoning about all $K^n$ combinations of latent samples. In principle, we can reason efficiently over $K^n$ combinations of samples by exploiting conditional independencies in the generative model. In practice, however, this requires complex algorithms that traverse backwards through the graphical model, and a separate backward traversal is needed for each computation (posterior expectations, marginals and samples). Our contribution is to exploit the source term trick from physics to entirely avoid the need to hand-write backward traversals. Instead, we demonstrate how to compute all the required quantities -- posterior expectations, marginals and samples -- simply by differentiating through a slightly modified marginal likelihood estimator.
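To illustrate the core identity behind the source term trick, here is a minimal sketch in JAX for plain (non-parallel) importance sampling. The model, proposal, and function names are illustrative assumptions, not the paper's implementation: we add a source term $J\,m(z)$ to the log-weights inside the marginal likelihood estimator, and differentiating its logarithm with respect to $J$ at $J=0$ recovers the self-normalized importance sampling estimate of the posterior expectation $\mathbb{E}[m(z)]$.

```python
import jax
import jax.numpy as jnp

def log_marginal_with_source(J, logw, moments):
    # Modified log marginal-likelihood estimator:
    #   log (1/K) sum_k w_k exp(J * m(z_k))
    # Its derivative at J = 0 is sum_k w_k m(z_k) / sum_k w_k,
    # i.e. the self-normalized IS estimate of E[m(z)].
    K = logw.shape[0]
    return jax.scipy.special.logsumexp(logw + J * moments) - jnp.log(K)

# Toy conjugate model (an assumption for this sketch):
# prior z ~ N(0, 1), likelihood x ~ N(z, 1), observed x = 1,
# proposal = prior, so log-weight = log-likelihood.
key = jax.random.PRNGKey(0)
K = 10_000
z = jax.random.normal(key, (K,))
logw = jax.scipy.stats.norm.logpdf(1.0, loc=z)

# Moment of interest: the posterior mean, so m(z) = z.
moments = z

# Differentiate the modified estimator at J = 0.
post_mean = jax.grad(log_marginal_with_source)(0.0, logw, moments)
# For this conjugate model the exact posterior is N(0.5, 0.5),
# so post_mean should be close to 0.5.
```

The same pattern extends to other quantities: choosing different source terms (e.g. indicator functions for marginals) turns other posterior computations into derivatives of the one estimator, which autodiff evaluates without any hand-written backward traversal.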