We formally verify an algorithm for approximate policy iteration on Factored Markov Decision Processes using the interactive theorem prover Isabelle/HOL. Next, we show how the formalized algorithm can be refined to an executable, verified implementation. The implementation is evaluated on benchmark problems to show its practicability. As part of the refinement, we develop verified software to certify Linear Programming solutions. The algorithm builds on a diverse library of formalized mathematics and pushes existing methodologies for interactive theorem provers to the limits. We discuss the process of the verification project and the modifications to the algorithm needed for formal verification.
翻译:我们使用交互式定理证明器Isabelle/HOL,对基于因子化马尔可夫决策过程的近似策略迭代算法进行了形式化验证。随后,我们展示了如何将形式化算法精化为可执行的验证实现。通过在基准问题上对该实现进行评估,我们证明了其实用性。作为精化过程的一部分,我们开发了用于验证线性规划解正确性的认证软件。该算法建立在多样化的形式化数学库基础上,并将交互式定理证明器的现有方法论推向了极限。我们讨论了该验证项目的实施过程,以及为满足形式化验证要求而对算法所做的必要修改。