Neural Lyapunov and barrier certificates have recently emerged as powerful tools for verifying the safety and stability properties of deep reinforcement learning (RL) controllers. However, existing methods offer guarantees only under fixed, ideal, unperturbed dynamics, limiting their reliability in real-world applications where dynamics may deviate due to uncertainties. In this work, we study the problem of synthesizing \emph{robust neural Lyapunov barrier certificates} that maintain their guarantees under perturbations in system dynamics. We formally define a robust Lyapunov barrier function and specify sufficient conditions, based on Lipschitz continuity, that ensure robustness against bounded perturbations. We propose practical training objectives that enforce these conditions via adversarial training, Lipschitz neighborhood bounds, and global Lipschitz regularization. We validate our approach in two practically relevant environments, Inverted Pendulum and 2D Docking. The former is a widely studied benchmark, while the latter is a safety-critical task in autonomous systems. We show that our methods significantly improve both certified robustness bounds (by up to $4.6$ times) and empirical success rates under strong perturbations (by up to $2.4$ times) compared to the baseline. Our results demonstrate the effectiveness of training robust neural certificates for safe RL under perturbations in dynamics.