Predictive safety filters (PSFs) leverage model predictive control to enforce constraint satisfaction during deep reinforcement learning (RL) exploration, yet their reliance on first-principles models or Gaussian processes limits scalability and broader applicability. Meanwhile, model-based RL (MBRL) methods routinely employ probabilistic ensemble (PE) neural networks to capture complex, high-dimensional dynamics from data with minimal prior knowledge. However, existing attempts to integrate PEs into PSFs lack rigorous uncertainty quantification. We introduce the Uncertainty-Aware Predictive Safety Filter (UPSi), a PSF that provides rigorous safety predictions using PE dynamics models by formulating future outcomes as reachable sets. UPSi introduces an explicit certainty constraint that prevents model exploitation and integrates seamlessly into common MBRL frameworks. We evaluate UPSi within Dyna-style MBRL on standard safe RL benchmarks and report substantial improvements in exploration safety over prior neural network PSFs while maintaining performance on par with standard MBRL. UPSi bridges the gap between the scalability and generality of modern MBRL and the safety guarantees of predictive safety filters.
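To make the idea of reachable-set safety prediction with a certainty constraint concrete, the following is a minimal sketch and not the UPSi formulation itself, whose details the abstract does not give. It assumes a one-dimensional state, a hypothetical toy ensemble of linear models, interval (box) over-approximations of the reachable set, a finite grid of candidate actions in place of a proper MPC optimization, and a fixed backup action sequence. All function names and parameters (`make_toy_ensemble`, `step_interval`, `is_plan_certified`, `safety_filter`, `x_limit`, `max_width`, `k`) are illustrative assumptions.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Interval = Tuple[float, float]
# A probabilistic model maps (state, action) to a predicted (mean, std).
Model = Callable[[float, float], Tuple[float, float]]


def make_toy_ensemble() -> List[Model]:
    """Hypothetical ensemble: x' = a*x + b*u with state-dependent std."""
    params = [(1.00, 0.10), (1.02, 0.09), (0.98, 0.11)]

    def make(a: float, b: float) -> Model:
        def model(x: float, u: float) -> Tuple[float, float]:
            return a * x + b * u, 0.01 + 0.02 * abs(x)
        return model

    return [make(a, b) for a, b in params]


def step_interval(ensemble: Sequence[Model], box: Interval, u: float,
                  k: float = 2.0) -> Interval:
    """Propagate a state interval one step through every ensemble member.

    Assumption: evaluating only the interval endpoints is sound here because
    the toy models are monotone in the state; general networks would need a
    proper set-propagation method.
    """
    lows, highs = [], []
    for model in ensemble:
        for x in box:
            mean, std = model(x, u)
            lows.append(mean - k * std)
            highs.append(mean + k * std)
    return min(lows), max(highs)


def is_plan_certified(ensemble: Sequence[Model], x0: float,
                      actions: Sequence[float], x_limit: float,
                      max_width: float) -> bool:
    """Check the state constraint and the certainty constraint along a plan."""
    box = (x0, x0)
    for u in actions:
        lo, hi = box = step_interval(ensemble, box, u)
        if lo < -x_limit or hi > x_limit:
            return False  # reachable set may leave the safe region
        if hi - lo > max_width:
            return False  # prediction too uncertain to be trusted
    return True


def safety_filter(ensemble: Sequence[Model], x0: float, proposed_u: float,
                  candidates: Sequence[float], tail: Sequence[float],
                  x_limit: float = 1.0,
                  max_width: float = 0.5) -> Optional[float]:
    """Return the certified candidate closest to the proposed action.

    Returns None when no candidate can be certified.
    """
    feasible = [
        u for u in candidates
        if is_plan_certified(ensemble, x0, [u] + list(tail), x_limit, max_width)
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda u: abs(u - proposed_u))


ensemble = make_toy_ensemble()
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
# Near the boundary, the filter overrides an aggressive proposed action.
filtered = safety_filter(ensemble, x0=0.9, proposed_u=1.0,
                         candidates=grid, tail=[-1.0])
print(filtered)
```

In this sketch the width bound `max_width` plays the role of a certainty constraint: a plan is rejected when the ensemble's predicted interval grows too wide, even if it stays inside the state limits, so that the policy cannot exploit regions where the learned model is unreliable.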