We consider imperative programs that involve both randomization and pure nondeterminism. The central question is how to find a strategy resolving the pure nondeterminism such that the so-obtained determinized program satisfies a given quantitative specification, i.e., bounds on expected outcomes such as the expected final value of a program variable or the probability to terminate in a given set of states. We show how memoryless and deterministic (MD) strategies can be obtained in a semi-automatic fashion using deductive verification techniques. For loop-free programs, the MD strategies resulting from our weakest precondition-style framework are correct by construction. This extends to loopy programs, provided the loops are equipped with suitable loop invariants - just like in program verification. We show how our technique relates to the well-studied problem of obtaining strategies in countably infinite Markov decision processes with reachability-reward objectives. Finally, we apply our technique to several case studies.
翻译:我们考虑同时涉及随机化和纯非确定性的命令式程序。核心问题在于如何找到一种解决纯非确定性的策略,使得由此得到的确定性程序满足给定的定量规范,即关于预期结果的边界条件,例如程序变量的预期终值或在给定状态集终止的概率。我们展示了如何利用演绎验证技术以半自动化方式获得无记忆确定性策略。对于无循环程序,基于最弱前置条件框架生成的MD策略在构造上即保证正确性。该结果可扩展至循环程序,前提是循环配备了合适的循环不变量——正如程序验证中的做法。我们展示了该技术与在具有可达性-奖励目标的可数无限马尔可夫决策过程中获取策略这一经典问题之间的关联。最后,我们将该技术应用于多个案例研究。