Control barrier functions (CBFs) have become popular as safety filters that guarantee the safety of nonlinear dynamical systems for arbitrary inputs. However, it is difficult to construct functions that satisfy the CBF constraints for high-relative-degree systems with input constraints. To address this difficulty, recent work has explored learning CBFs with neural networks, an approach known as neural CBFs (NCBFs). Such methods, however, struggle to scale to higher-dimensional systems under input constraints. In this work, we first identify the challenges that NCBFs face during training. To address them, we then propose policy neural CBFs (PNCBFs), a method that constructs CBFs by learning the value function of a nominal policy, and we show that the value function of the maximum-over-time cost is a CBF. We demonstrate the effectiveness of our method in simulation on a variety of systems, ranging from toy linear systems to an F-16 jet with a 16-dimensional state space. Finally, we validate our approach in hardware experiments on a two-agent quadcopter system under tight input constraints.
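To make the phrase "value function of the maximum-over-time cost" concrete, the following is a minimal sketch rather than the method as implemented in this work. It assumes a toy double integrator with bounded input, an avoid set described by a function `h` with the convention that `h(x) > 0` is unsafe, and a hand-written braking controller as the nominal policy. The names `h`, `nominal_policy`, `step`, and `max_over_time_value`, as well as the constants `DT` and `U_MAX`, are hypothetical and chosen only for illustration. The value is approximated by a finite-horizon rollout instead of being learned by a neural network.

```python
import numpy as np

DT = 0.01     # integration step (assumed for this sketch)
U_MAX = 1.0   # input constraint |u| <= U_MAX (assumed for this sketch)


def h(x):
    """Hypothetical avoid-set function: h(x) > 0 means position exceeds 1."""
    return x[0] - 1.0


def nominal_policy(x):
    """Hypothetical nominal policy: brake as hard as the input bound allows."""
    return float(np.clip(-x[1] / DT, -U_MAX, U_MAX))


def step(x, u):
    """Forward-Euler step of a double integrator: p' = v, v' = u."""
    p, v = x
    return np.array([p + DT * v, v + DT * u])


def max_over_time_value(x0, horizon=2000):
    """Finite-horizon approximation of max_t h(x_t) under the nominal policy."""
    x = np.asarray(x0, dtype=float)
    value = h(x)
    for _ in range(horizon):
        x = step(x, nominal_policy(x))
        value = max(value, h(x))  # keep the worst (largest) h seen so far
    return value


if __name__ == "__main__":
    for x0 in [(0.0, 0.0), (0.0, 0.5), (0.9, 1.0)]:
        print(x0, "h =", h(np.array(x0)), "V =", max_over_time_value(x0))
```

In this sketch the state `(0.9, 1.0)` is outside the avoid set (`h < 0`), yet its max-over-time value is positive because the nominal policy cannot stop in time under the input bound. The value function therefore flags states that are currently safe but doomed under the nominal policy, which is the kind of information a safety filter needs. A learned version would fit a function approximator to such values rather than rolling out at query time.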