Control barrier functions (CBFs) have become popular as safety filters that guarantee the safety of nonlinear dynamical systems for arbitrary inputs. However, it is difficult to construct functions that satisfy the CBF constraints for high-relative-degree systems with input constraints. To address these challenges, recent work has explored learning CBFs with neural networks, an approach known as neural CBFs (NCBFs). However, such methods struggle to scale to higher-dimensional systems under input constraints. In this work, we first identify challenges that NCBFs face during training. Next, to address these challenges, we propose policy neural CBF (PNCBF), a method of constructing CBFs by learning the value function of a nominal policy, and show that the value function of the maximum-over-time cost is a CBF. We demonstrate the effectiveness of our method in simulation on a variety of systems, ranging from toy linear systems to an F-16 jet with a 16-dimensional state space. Finally, we validate our approach in hardware experiments on a two-agent quadcopter system under tight input constraints.
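To make the phrase "value function of the maximum-over-time cost" concrete, the following is a minimal sketch and not the method as implemented in this work. It assumes a discrete-time toy double integrator, a sign convention in which `h(x) > 0` means unsafe, a finite rollout horizon in place of an infinite one, and a hypothetical braking policy as the nominal policy. All names (`rollout_max_cost`, `step`, `h`, `nominal_policy`, `DT`, `U_MAX`) are illustrative assumptions. The quantity computed is the largest value of `h` seen along the closed-loop rollout, which is the kind of target a learned value function would approximate.

```python
DT = 0.1      # assumed integration step
U_MAX = 1.0   # assumed input bound |u| <= U_MAX


def step(x, u):
    """One Euler step of a toy double integrator with a clipped input."""
    p, v = x
    u = max(-U_MAX, min(U_MAX, u))  # enforce the input constraint
    return (p + DT * v, v + DT * u)


def h(x):
    """Assumed safety cost: positive means unsafe (position beyond 1)."""
    return x[0] - 1.0


def nominal_policy(x):
    """Hypothetical nominal policy: brake at full authority while moving forward."""
    return -U_MAX if x[1] > 0 else 0.0


def rollout_max_cost(x0, policy, horizon):
    """Return max over k = 0..horizon of h(x_k) along the closed-loop rollout."""
    x = x0
    worst = h(x)
    for _ in range(horizon):
        x = step(x, policy(x))
        worst = max(worst, h(x))
    return worst


slow = rollout_max_cost((0.0, 0.5), nominal_policy, horizon=100)
fast = rollout_max_cost((0.0, 2.0), nominal_policy, horizon=100)
print(slow < 0.0, fast > 0.0)
```

Under these assumptions, a negative value means the nominal policy keeps the rollout safe from that state, and a positive value means it does not. By construction the value is never smaller than `h(x0)`, and it cannot increase along the nominal trajectory when the remaining horizon shrinks accordingly; these are the properties that make a function of this kind a natural CBF candidate.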