Control barrier functions (CBFs) enable guaranteed safe multi-agent navigation in the continuous domain. The resulting navigation performance, however, is highly sensitive to the underlying hyperparameters. Traditional approaches use fixed CBFs, whose parameters are tuned a priori, and hence typically perform poorly in cluttered and highly dynamic environments: conservative parameter values can lead to inefficient agent trajectories, or even failure to reach goal positions, whereas aggressive parameter values can lead to infeasible controls. To overcome these issues, we propose online CBFs, whose hyperparameters are tuned in real time as a function of what agents perceive in their immediate neighborhood. Since the explicit relationship between CBFs and navigation performance is hard to model, we use reinforcement learning to learn CBF-tuning policies in a model-free manner. Because we parameterize the policies with graph neural networks (GNNs), we can synthesize decentralized agent controllers that adjust parameter values locally, varying the degree of conservative and aggressive behavior across agents. Simulations and real-world experiments show that (i) online CBFs can solve navigation scenarios that are infeasible for fixed CBFs, and (ii) they improve navigation performance by adapting to other agents and to changes in the environment.
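To make the sensitivity to CBF hyperparameters concrete, the following is a minimal sketch of a fixed-parameter CBF safety filter, not the method proposed in this paper. It assumes a single agent with single-integrator dynamics (velocity is the control input), one circular obstacle, the barrier h(x) = ||x - x_obs||^2 - r^2, and a linear class-K term with gain `alpha`. The function name `cbf_filter` and all parameter names are hypothetical. The filter solves min ||u - u_nom||^2 subject to grad_h(x) · u >= -alpha * h(x), which for a single constraint has a closed-form projection. The learned GNN policy that would select `alpha` online is not shown.

```python
def cbf_filter(pos, obstacle, radius, u_nom, alpha):
    """Project a nominal 2D velocity onto the CBF-safe half-space.

    Assumptions (illustrative only): single-integrator dynamics,
    one circular obstacle, h(x) = ||x - x_obs||^2 - radius^2,
    and the constraint grad_h(x) . u >= -alpha * h(x).
    """
    dx = pos[0] - obstacle[0]
    dy = pos[1] - obstacle[1]
    h = dx * dx + dy * dy - radius ** 2   # barrier value, positive when safe
    a = (2.0 * dx, 2.0 * dy)              # gradient of h with respect to pos
    b = -alpha * h                        # right-hand side of the constraint

    # Margin by which the nominal control satisfies a . u >= b.
    slack = a[0] * u_nom[0] + a[1] * u_nom[1] - b
    if slack >= 0.0:
        return u_nom                      # nominal control is already safe

    norm_sq = a[0] ** 2 + a[1] ** 2
    if norm_sq == 0.0:
        # Agent sits exactly at the obstacle center: no direction helps.
        raise ValueError("CBF constraint is infeasible at the obstacle center")

    # Closed-form solution of the one-constraint QP: move u_nom along the
    # gradient just far enough to meet the constraint with equality.
    scale = -slack / norm_sq
    return (u_nom[0] + scale * a[0], u_nom[1] + scale * a[1])


# Agent heading straight at an obstacle two units away.
conservative = cbf_filter((-2.0, 0.0), (0.0, 0.0), 1.0, (1.0, 0.0), alpha=0.5)
aggressive = cbf_filter((-2.0, 0.0), (0.0, 0.0), 1.0, (1.0, 0.0), alpha=2.0)
```

With the small gain the filter slows the agent well before the obstacle, while the large gain leaves the nominal velocity untouched at the same position. This is the conservative versus aggressive trade-off that a fixed `alpha` cannot resolve across changing scenes.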