For many multiagent control problems, neural networks (NNs) have enabled promising new capabilities. However, many of these systems lack formal guarantees (e.g., collision avoidance, robustness), which prevents leveraging these advances in safety-critical settings. While there is recent work on formal verification of NN-controlled systems, most existing techniques cannot handle scenarios with more than one agent. To address this research gap, this paper presents a backward reachability-based approach for verifying the collision avoidance properties of Multi-Agent Neural Feedback Loops (MA-NFLs). Given the dynamics models and trained control policies of each agent, the proposed algorithm computes relative backprojection sets by solving a series of Mixed Integer Linear Programs (MILPs) offline for each pair of agents. Our pair-wise approach is parallelizable and thus scales well with increasing number of agents, and we account for state measurement uncertainties, making it well aligned with real-world scenarios. Using those results, the agents can quickly check for collision avoidance online by solving low-dimensional Linear Programs (LPs). We demonstrate the proposed algorithm can verify collision-free properties of a MA-NFL with agents trained to imitate a collision avoidance algorithm (Reciprocal Velocity Obstacles). We further demonstrate the computational scalability of the approach on systems with up to 10 agents.
翻译:对于许多多智能体控制问题,神经网络(NNs)已带来极具前景的新能力。然而,这些系统大多缺乏形式化保证(如碰撞避免、鲁棒性),阻碍了这些进展在安全关键场景中的应用。尽管近期有关于神经网络控制系统形式化验证的研究,但现有技术大多无法处理包含两个以上智能体的场景。为填补这一研究空白,本文提出一种基于后向可达性的方法,用于验证多智能体神经反馈环路(MA-NFLs)的碰撞避免特性。该方法针对每个智能体的动力学模型及训练得到的控制策略,通过离线求解每对智能体间的混合整数线性规划(MILPs)序列,计算相对背投影集合。这种按对处理的方法具有可并行化特性,因此能随智能体数量增加良好扩展;同时考虑了状态测量不确定性,使其与真实场景高度契合。基于离线计算结果,各智能体可通过在线求解低维线性规划(LPs)快速检验碰撞避免性能。实验表明,该算法能验证由模仿碰撞避免算法(相互速度障碍法)训练的智能体组成的MA-NFL系统的无碰撞特性。我们进一步在包含多达10个智能体的系统上验证了该方法的计算可扩展性。