Control barrier functions (CBFs) have become popular as a safety filter to guarantee the safety of nonlinear dynamical systems for arbitrary inputs. However, it is difficult to construct functions that satisfy the CBF constraints for systems with high relative degree and input constraints. To address these challenges, recent work has explored learning CBFs with neural networks, an approach known as neural CBFs (NCBFs). However, such methods struggle to scale to higher-dimensional systems under input constraints. In this work, we first identify the challenges that NCBFs face during training. To address them, we then propose policy neural CBFs (PNCBFs), a method of constructing CBFs by learning the value function of a nominal policy, and show that the value function of the maximum-over-time cost is a CBF. We demonstrate the effectiveness of our method in simulation on a variety of systems, ranging from toy linear systems to an F-16 jet with a 16-dimensional state space. Finally, we validate our approach in hardware on a two-agent quadcopter system under tight input constraints.
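To make the idea of a maximum-over-time value function concrete, the following is a minimal sketch and not the method from this work: it replaces the learned neural network with brute-force rollouts on a toy system. Everything in it is an assumption made for illustration: the discrete-time double integrator, the braking `nominal_policy`, the input bound `u_max`, the finite `horizon`, and the sign convention that `h(x) > 0` marks an unsafe state.

```python
# Illustrative sketch only: a hand-picked toy system and policy (assumptions),
# evaluated by rollouts instead of a learned network.

def step(x, u, dt=0.05):
    """One Euler step of a double integrator with state x = (position, velocity)."""
    p, v = x
    return (p + v * dt, v + u * dt)

def h(x):
    """Constraint function (assumed convention): h(x) > 0 iff position exceeds 1."""
    return x[0] - 1.0

def nominal_policy(x, dt=0.05, u_max=1.0):
    """Hypothetical nominal policy: brake toward zero velocity with |u| <= u_max."""
    return max(-u_max, min(u_max, -x[1] / dt))

def policy_value(x, horizon=400, dt=0.05):
    """Maximum-over-time cost: the largest h seen along the nominal-policy rollout.

    The finite horizon is an approximation; it is long enough here for the
    braking policy to bring the system to rest.
    """
    worst = h(x)
    for _ in range(horizon):
        x = step(x, nominal_policy(x, dt), dt)
        worst = max(worst, h(x))
    return worst

if __name__ == "__main__":
    slow = (0.0, 1.0)   # can stop before the boundary
    fast = (0.0, 2.0)   # currently safe (h < 0) but cannot stop in time
    print("h(slow) =", h(slow), " V(slow) =", policy_value(slow))
    print("h(fast) =", h(fast), " V(fast) =", policy_value(fast))
```

In this sketch both initial states satisfy the constraint at the start, but only the slower one has a non-positive value, because the bounded braking input cannot stop the faster one before the boundary. The value is also non-increasing along the nominal rollout, which is the barrier-like property that makes its zero sublevel set a natural candidate for a safe set under input constraints.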