Safety is one of the biggest concerns in applying reinforcement learning (RL) to the physical world. At its core, the challenge is to ensure that RL agents persistently satisfy a hard state constraint when neither a white-box nor a black-box dynamics model is available. This paper presents an integrated model learning and safe control framework that safeguards any agent whose dynamics are learned as Gaussian processes. The proposed theory provides (i) a novel method for constructing an offline dataset for model learning that best achieves the safety requirements; (ii) a parameterization rule for the safety index that ensures the existence of a safe control; and (iii) a safety guarantee, in the form of probabilistic forward invariance, that holds when the model is learned from the aforementioned dataset. Simulation results show that the framework achieves near-zero safety violations on various continuous control tasks.
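To make the ingredients named above concrete, the following is a minimal sketch of a Gaussian-process dynamics model combined with a safety-index filter. It is not the paper's algorithm: the 1-D toy system `true_dynamics`, the uniform offline grid, the RBF kernel and its hyperparameters, the safety index phi(x) = x - x_max, the margin `eta`, the confidence multiplier `beta`, and the control grid are all assumptions chosen for illustration. The paper's dataset construction method and parameterization rule are not reproduced here.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between point sets A (n, d) and B (m, d)."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

class GPRegressor:
    """Minimal exact GP regression with zero prior mean (illustrative only)."""

    def __init__(self, noise=1e-4):
        self.noise = noise

    def fit(self, X, y):
        self.X = X
        K = rbf_kernel(X, X) + self.noise * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        # alpha = K^{-1} y, computed through the Cholesky factor.
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))
        return self

    def predict(self, Xq):
        """Return the posterior mean and standard deviation at query points Xq."""
        Ks = rbf_kernel(Xq, self.X)
        mean = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = rbf_kernel(Xq, Xq).diagonal() - (v**2).sum(axis=0)
        return mean, np.sqrt(np.maximum(var, 0.0))

def true_dynamics(x, u):
    """Hypothetical toy system x_dot = 0.5 x + u, unknown to the controller."""
    return 0.5 * x + u

# Assumed offline dataset: a uniform grid over states and controls.
xs, us = np.meshgrid(np.linspace(-2, 2, 9), np.linspace(-2, 2, 9))
X_train = np.column_stack([xs.ravel(), us.ravel()])
y_train = true_dynamics(X_train[:, 0], X_train[:, 1])
gp = GPRegressor().fit(X_train, y_train)

def safe_control(gp, x, u_nominal, x_max=1.0, eta=0.1, beta=2.0,
                 u_grid=np.linspace(-2, 2, 41)):
    """Filter u_nominal using the assumed safety index phi(x) = x - x_max.

    Inside the safe set (phi < 0) the nominal control passes through.
    Otherwise, pick the grid control closest to u_nominal whose upper
    confidence bound on phi_dot (here equal to x_dot) is at most -eta.
    """
    phi = x - x_max
    if phi < 0:
        return float(u_nominal)
    Xq = np.column_stack([np.full_like(u_grid, x), u_grid])
    mean, std = gp.predict(Xq)
    ucb = mean + beta * std  # pessimistic estimate of phi_dot
    feasible = ucb <= -eta
    if feasible.any():
        candidates = u_grid[feasible]
        return float(candidates[np.argmin(np.abs(candidates - u_nominal))])
    # Fallback when no grid control is certified: least-unsafe choice.
    return float(u_grid[np.argmin(ucb)])

u_safe = safe_control(gp, x=1.0, u_nominal=1.0)
print("filtered control at the boundary:", u_safe)
```

In this sketch the confidence bound stands in for the probabilistic part of the guarantee: a control is accepted only if the learned model, with its uncertainty added pessimistically, still predicts that the safety index decreases. Whether such a control exists at every boundary state depends on the index design and the data coverage, which is what items (i) and (ii) of the abstract address.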