A fundamental computational task in probabilistic programming is to infer a program's output (posterior) distribution from a given initial (prior) distribution. This problem is challenging, especially for expressive languages that feature loops or unbounded recursion. While most of the existing literature focuses on statistical approximation, in this paper we address the problem of mathematically exact inference. To achieve this for programs with loops, we rely on a relatively underexplored type of probabilistic loop invariant, which is linked to a loop's so-called occupation measure. The occupation measure associates program states with their expected number of visits, given the initial distribution. Based on this, we derive the notion of an occupation invariant. Such invariants are essentially dual to probabilistic martingales, the predominant technique for formal probabilistic loop analysis in the literature. A key feature of occupation invariants is that they can take the initial distribution into account and often yield a proof of positive almost sure termination as a by-product. Finally, we present an automatic, template-based invariant synthesis approach for occupation invariants by encoding them as generating functions. The approach is implemented and evaluated on a set of benchmarks.
翻译:概率编程中的一个基本计算任务是从给定的初始(先验)分布推断程序的输出(后验)分布。这一问题具有挑战性,尤其是对于包含循环或无界递归的表达性语言。现有文献大多关注统计近似方法,而本文则致力于数学上精确的推断问题。为实现对含循环程序的精确推断,我们依赖一类相对未被充分探索的概率循环不变量,该不变量与循环的所谓占位测度相关联。占位测度将程序状态与其在给定初始分布下的期望访问次数联系起来。基于此,我们推导出占位不变量的概念。这类不变量本质上与概率鞅(文献中形式化概率循环分析的主流技术)形成对偶关系。占位不变量的一个关键特性在于其能够考虑初始分布,并常常作为副产品产生正几乎必然终止性的证明。最后,我们提出一种基于模板的自动不变量综合方法,通过将占位不变量编码为生成函数来实现。该方法已实现并在基准测试集上进行了评估。