Morgan and McIver's weakest pre-expectation framework is one of the most well-established methods for deductive verification of probabilistic programs. Roughly, the idea is to generalize binary state assertions to real-valued expectations, which can measure expected values of probabilistic program quantities. While loop-free programs can be analyzed by mechanically transforming expectations, verifying loops usually requires finding an invariant expectation, a difficult task. We propose a new view of invariant expectation synthesis as a regression problem: given an input state, predict the average value of the post-expectation in the output distribution. Guided by this perspective, we develop the first data-driven invariant synthesis method for probabilistic programs. Unlike prior work on probabilistic invariant inference, our approach can learn piecewise continuous invariants without relying on template expectations, and also works with black-box access to the program. We also develop a data-driven approach to learn sub-invariants from data, which can be used to upper- or lower-bound expected values. We implement our approaches and demonstrate their effectiveness on a variety of benchmarks from the probabilistic programming literature.
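The regression view described above can be illustrated with a toy sketch (not the paper's implementation): for the hypothetical loop `while flip(p): x = x + 1` with post-expectation `x`, the expected final value from initial state `x0` is `x0 + p/(1-p)`, so fitting a linear model to Monte Carlo estimates of the post-expectation, using only black-box runs of the program, recovers the shape of the invariant expectation.

```python
# Illustrative sketch of invariant synthesis as regression (an assumed toy
# example, not the authors' tool). Program: while flip(p): x = x + 1.
import random

def run_loop(x, p, rng):
    # Black-box execution of the probabilistic loop.
    while rng.random() < p:
        x += 1
    return x

def estimate_post(x0, p, rng, trials=2000):
    # Monte Carlo estimate of E[x_final | x0]: the regression target.
    return sum(run_loop(x0, p, rng) for _ in range(trials)) / trials

rng = random.Random(0)
p = 0.5
states = list(range(10))                       # sampled input states
targets = [estimate_post(x, p, rng) for x in states]

# Least-squares fit of a + b*x by hand (no external dependencies).
n = len(states)
mx = sum(states) / n
my = sum(targets) / n
b = sum((x - mx) * (y - my) for x, y in zip(states, targets)) / \
    sum((x - mx) ** 2 for x in states)
a = my - b * mx
print(f"learned expectation ~ {a:.2f} + {b:.2f}*x")
# For p = 0.5 the true invariant expectation is x + p/(1-p) = x + 1,
# so the fit should come out close to 1 + 1*x.
```

The learned coefficients approximate the closed-form invariant `x + p/(1-p)`; in general, richer feature maps over the state play the role of the piecewise continuous invariants mentioned above.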