Computing the posterior distribution of a probabilistic program is a hard task for which no one-size-fits-all solution exists. We propose Gaussian Semantics, which approximates the exact probabilistic semantics of a bounded program by means of Gaussian mixtures. It is parametrized by a map that associates each program location with the moment order to be matched in the approximation. We provide two main contributions. The first is a universal approximation theorem stating that, under mild conditions, Gaussian Semantics can approximate the exact semantics arbitrarily closely. The second is an approximation that analytically matches moments up to second order, in the face of the generally difficult problem of matching the moments of Gaussian mixtures at arbitrary order. We test our second-order Gaussian approximation (SOGA) on a number of case studies from the literature. We show that it can provide accurate estimates in models not supported by other approximation methods, or when exact symbolic techniques fail because of complex expressions or unsimplified integrals. On two notable classes of problems, namely collaborative filtering and programs involving mixtures of continuous and discrete distributions, we show that SOGA significantly outperforms alternative techniques in terms of accuracy and computational time.
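To make the idea of matching moments up to second order concrete, the following is a minimal sketch of the standard identity for collapsing a univariate Gaussian mixture into a single Gaussian with the same mean and variance. It is not the SOGA algorithm itself, which the abstract does not spell out; the function name `match_two_moments` and the `(weight, mean, variance)` tuple layout are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: each component is (weight, mean, variance).
Component = Tuple[float, float, float]


def match_two_moments(mixture: List[Component]) -> Tuple[float, float]:
    """Return the (mean, variance) of the single Gaussian that matches
    the first two moments of a univariate Gaussian mixture."""
    total = sum(w for w, _, _ in mixture)
    if total <= 0:
        raise ValueError("mixture weights must sum to a positive value")
    # First moment: weighted average of the component means.
    mean = sum(w * m for w, m, _ in mixture) / total
    # Second raw moment of each component is variance + mean^2.
    second = sum(w * (v + m * m) for w, m, v in mixture) / total
    # Variance = second raw moment minus squared mean.
    return mean, second - mean * mean


# Illustrative bimodal mixture: two unit-variance components at -1 and +1.
mix = [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)]
mean, var = match_two_moments(mix)
```

The matched variance exceeds each component's variance because it also absorbs the spread between the component means, which is the information a second-order match retains and a first-order match would discard.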