Safety filters have been shown to be effective tools for ensuring the safety of control systems with unsafe nominal policies. To address the scalability challenges of traditional synthesis methods, learning-based approaches have been proposed for designing safety filters for systems with high-dimensional state and control spaces. However, the inevitable errors in the decisions of these learned models raise concerns about their reliability and the safety guarantees they offer. This paper presents Adaptive Conformal Filtering (ACoFi), a method that combines learned Hamilton-Jacobi reachability-based safety filters with adaptive conformal inference. Under ACoFi, the filter dynamically adjusts its switching criterion based on the observed errors in its predictions of the safety of actions. Uncertainty in the safety assessment is quantified by the range of possible safety values of the nominal policy's output, and the filter switches from the nominal policy to the learned safe policy when that range suggests the nominal action might be unsafe. We show that ACoFi guarantees that the rate of incorrectly quantifying uncertainty in the predicted safety of the nominal policy is asymptotically upper bounded by a user-defined parameter, which yields a soft rather than a hard safety guarantee. We evaluate ACoFi in a Dubins car simulation and a Safety Gymnasium environment, empirically demonstrating that it significantly outperforms a baseline that uses a fixed switching threshold, achieving higher learned safety values and fewer safety violations, especially in out-of-distribution scenarios.
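The switching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes a sign convention in which a safety value of at least zero means safe, a symmetric interval around the predicted value, an absolute-residual nonconformity score, and that a reference safety value becomes available after each step. The class name `AdaptiveConformalSwitch` and the parameters `alpha`, `gamma`, `predicted_value`, and `observed_value` are hypothetical. The miscoverage level is adapted with the standard adaptive conformal inference update, alpha_{t+1} = alpha_t + gamma * (alpha - err_t).

```python
import numpy as np


class AdaptiveConformalSwitch:
    """Illustrative switching rule; all names here are hypothetical."""

    def __init__(self, alpha=0.1, gamma=0.05):
        self.alpha = alpha      # user-defined target miscoverage rate
        self.gamma = gamma      # adaptation step size
        self.alpha_t = alpha    # current, adapted miscoverage level
        self.scores = []        # past absolute prediction errors

    def radius(self):
        """Half-width of the interval around the predicted safety value."""
        # With no calibration data, or a non-positive level, be fully conservative.
        if not self.scores or self.alpha_t <= 0.0:
            return float("inf")
        if self.alpha_t >= 1.0:
            return 0.0
        return float(np.quantile(self.scores, 1.0 - self.alpha_t))

    def should_switch(self, predicted_value):
        """Switch to the safe policy if the interval's lower end is below zero."""
        # Assumption: a safety value >= 0 means the nominal action is safe.
        return predicted_value - self.radius() < 0.0

    def update(self, predicted_value, observed_value):
        """Record one prediction error and adapt the miscoverage level."""
        score = abs(observed_value - predicted_value)
        # err is 1 when the observed value fell outside the predicted interval.
        err = 1.0 if score > self.radius() else 0.0
        self.alpha_t += self.gamma * (self.alpha - err)
        self.scores.append(score)
        return err
```

In this sketch a miss lowers `alpha_t`, which widens later intervals and makes the filter switch more readily, while a run of covered predictions slowly raises `alpha_t` and narrows them. A fixed-threshold baseline corresponds to keeping the radius constant.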