Many autonomous systems are safety-critical and require robust closed-loop control that respects physical limitations and safety constraints. Real-world systems such as autonomous ships must also cope with nonlinear dynamics and environmental disturbances. Reinforcement learning (RL) is increasingly used to adapt to such complex scenarios, but standard frameworks for ensuring safety and stability are lacking. Predictive Safety Filters (PSFs) offer a promising solution: they ensure constraint satisfaction in learning-based control without requiring the learning algorithm itself to handle constraints explicitly. This modular approach permits arbitrary control policies, with the safety filter optimizing the proposed actions so that they meet the physical and safety constraints. We apply this approach to marine navigation, combining RL with a PSF on a simulated Cybership II model. The RL agent is trained on path following and collision avoidance, while the PSF monitors and modifies the control actions to ensure safety. Evaluated against a standard RL agent without a PSF, the results demonstrate that the PSF maintains safety without hindering the RL agent's learning rate or performance.
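To make the filtering idea concrete, the following is a minimal sketch of a safety filter on a toy problem, not the formulation used in this work. Everything in it is an assumption made for illustration: the one-dimensional double-integrator model, the position limit `x_max`, the input bound `u_max`, and the function names `step`, `backup_is_safe`, and `safety_filter`. Where the abstract describes the filter as optimizing proposed actions, this sketch substitutes a simple grid search combined with a fixed full-braking backup trajectory.

```python
def step(x, v, u, dt=0.1):
    """Explicit-Euler double integrator: position x, velocity v, acceleration u."""
    return x + v * dt, v + u * dt


def backup_is_safe(x, v, x_max, u_max, dt=0.1, horizon=200):
    """Predict a full-braking backup trajectory and check it never passes x_max."""
    for _ in range(horizon):
        if x > x_max:
            return False  # predicted constraint violation
        if v <= 0.0:
            return True   # stopped (or moving away) before reaching the limit
        x, v = step(x, v, -u_max, dt)
    return False  # did not stop within the horizon: treat as unsafe


def safety_filter(x, v, u_proposed, x_max, u_max, dt=0.1, n_grid=41):
    """Return the admissible input closest to u_proposed whose successor
    state still has a safe braking trajectory (hypothetical toy filter)."""
    u_clipped = max(-u_max, min(u_max, u_proposed))
    grid = [-u_max + 2.0 * u_max * i / (n_grid - 1) for i in range(n_grid)]
    # Try candidates in order of distance from the proposed action, so the
    # proposal itself is returned unchanged whenever it is certified safe.
    for u in sorted([u_clipped] + grid, key=lambda c: abs(c - u_clipped)):
        x_next, v_next = step(x, v, u, dt)
        if backup_is_safe(x_next, v_next, x_max, u_max, dt):
            return u
    return -u_max  # no certified input: fall back to full braking
```

In this toy setting, a learning agent would call `safety_filter(x, v, u_rl, x_max, u_max)` at every step and apply the returned input instead of its own proposal `u_rl`. Far from the limit the proposal passes through unchanged; close to it, the filter substitutes the nearest input that still leaves room to brake.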