This paper introduces a sampling-based strategy synthesis algorithm for nondeterministic hybrid systems with complex continuous dynamics under temporal and reachability constraints. We view the evolution of the hybrid system as a two-player game, where the nondeterminism is an adversarial player whose objective is to prevent achieving temporal and reachability goals. The aim is to synthesize a winning strategy -- a reactive (robust) strategy that guarantees the satisfaction of the goals under all possible moves of the adversarial player. The approach is based on growing a (search) game-tree in the hybrid space by combining a sampling-based planning method with a novel bandit-based technique to select and improve on partial strategies. We provide conditions under which the algorithm is probabilistically complete, i.e., if a winning strategy exists, the algorithm will almost surely find it. The case studies and benchmark results show that the algorithm is general and consistently outperforms the state of the art.
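As a rough illustration of how a bandit rule can decide which partial strategy to extend next, the sketch below uses UCB1. The abstract does not say which bandit rule the algorithm uses, so UCB1 is an assumption here, and the names `ucb1_select` and `run_bandit` are hypothetical. Each arm stands for a candidate partial strategy. The simulated Bernoulli reward is only a stand-in for whether an extension of that strategy withstood the adversary's moves; it is not the paper's reward definition.

```python
import math
import random


def ucb1_select(counts, wins, c=math.sqrt(2)):
    """Return the index of the arm (candidate partial strategy) to try next.

    counts[i] is how often arm i was tried, wins[i] its accumulated reward.
    This is plain UCB1, assumed here for illustration only.
    """
    # Try every arm once before comparing confidence bounds.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    total = sum(counts)
    # Empirical mean plus an exploration bonus that shrinks with more trials.
    scores = [
        wins[i] / counts[i] + c * math.sqrt(math.log(total) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(success_probs, rounds=2000, seed=0):
    """Simulate UCB1 on Bernoulli arms and return the pull count per arm.

    success_probs are made-up probabilities that extending a given partial
    strategy succeeds; a real planner would get this signal from its search.
    """
    rng = random.Random(seed)
    counts = [0] * len(success_probs)
    wins = [0.0] * len(success_probs)
    for _ in range(rounds):
        arm = ucb1_select(counts, wins)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        counts[arm] += 1
        wins[arm] += reward
    return counts


if __name__ == "__main__":
    # The arm with the highest success probability should attract most pulls.
    print(run_bandit([0.2, 0.5, 0.9]))
```

The exploration bonus keeps rarely tried strategies from being discarded too early, while most of the effort goes to those that have worked so far. That is the general select-and-improve trade-off the abstract alludes to.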