Statistical model checking estimates probabilities and expectations of interest in probabilistic system models by using random simulations. Its results come with statistical guarantees. However, many tools use unsound statistical methods that produce incorrect results more often than they claim. In this paper, we provide a comprehensive overview of tools and their correctness, as well as of sound methods available for estimating probabilities from the literature. For expected rewards, we investigate how to bound the path reward distribution to apply sound statistical methods for bounded distributions, of which we recommend the Dvoretzky-Kiefer-Wolfowitz inequality that has not been used in SMC so far. We prove that even reachability rewards can be bounded in theory, and formalise the concept of limit-PAC procedures for a practical solution. The 'modes' SMC tool implements our methods and recommendations, which we use to experimentally confirm our results.
翻译:统计模型检验通过随机仿真估计概率系统模型中关注的概率与期望值,其结果具有统计保证。然而,许多工具采用不可靠的统计方法,导致结果错误的频率高于其声称的水平。本文系统梳理了现有工具及其正确性,并综述了文献中可用于概率估计的可靠方法。针对期望奖励问题,我们研究了如何界定路径奖励分布以应用适用于有界分布的可靠统计方法,其中推荐使用迄今未在统计模型检验中应用的Dvoretzky-Kiefer-Wolfowitz不等式。我们证明了可达性奖励在理论上亦可界定,并形式化了适用于实际解决方案的limit-PAC过程概念。'modes'统计模型检验工具实现了我们的方法与建议,我们通过实验验证了所得结论。