This paper introduces a sampling-based strategy synthesis algorithm for nondeterministic hybrid systems with complex continuous dynamics under temporal and reachability constraints. We view the evolution of the hybrid system as a two-player game in which the nondeterminism is an adversarial player whose objective is to prevent the system from achieving its temporal and reachability goals. The aim is to synthesize a winning strategy, that is, a reactive (robust) strategy that guarantees satisfaction of the goals under all possible moves of the adversarial player. The approach grows a search game tree in the hybrid space by combining a sampling-based planning method with a novel bandit-based technique for selecting and improving partial strategies. We provide conditions under which the algorithm is probabilistically complete, i.e., if a winning strategy exists, the algorithm will almost surely find it. Case studies and benchmark results show that the algorithm is general and consistently outperforms the state of the art.