Safety is one of the biggest concerns in applying reinforcement learning (RL) to the physical world. At its core, the challenge is to ensure that RL agents persistently satisfy a hard state constraint when neither a white-box nor a black-box dynamics model is available. This paper presents an integrated model learning and safe control framework that can safeguard any agent, with the agent's dynamics learned as Gaussian processes. The proposed theory provides (i) a novel method for constructing an offline dataset for model learning that best achieves the safety requirements; (ii) a parameterization rule for the safety index that ensures the existence of a safe control; and (iii) a safety guarantee, in the form of probabilistic forward invariance, when the model is learned from the aforementioned dataset. Simulation results show that our framework achieves almost zero safety violations on various continuous control tasks.
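To make the idea of safeguarding an agent with a Gaussian process dynamics model concrete, the following is a minimal sketch on a toy problem, not the algorithm developed in the paper. Everything in it is an assumption made for illustration: a one-dimensional continuous-time system x' = f(x) + u with unknown drift f, a safety index phi(x) = x - x_max (safe when phi < 0), an RBF kernel with hand-picked hyperparameters, a confidence multiplier `beta`, and the hypothetical names `GPDrift` and `safeguard`. When the state is on or beyond the boundary, the safeguard caps the reference control so that the upper confidence bound on phi' stays below `-eta`.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential kernel between two sets of scalar inputs."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    sq_dist = (a - b.T) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


class GPDrift:
    """GP regression model of the unknown drift f(x) (illustrative only)."""

    def __init__(self, x_train, y_train, noise=1e-4):
        self.x_train = np.asarray(x_train, dtype=float)
        y_train = np.asarray(y_train, dtype=float)
        k = rbf_kernel(self.x_train, self.x_train)
        k += noise * np.eye(len(self.x_train))
        # Cholesky factor of the regularized kernel matrix.
        self.chol = np.linalg.cholesky(k)
        self.alpha = np.linalg.solve(
            self.chol.T, np.linalg.solve(self.chol, y_train)
        )

    def predict(self, x):
        """Return the posterior mean and standard deviation of f at x."""
        k_star = rbf_kernel(self.x_train, [x])  # shape (n, 1)
        mean = float(k_star[:, 0] @ self.alpha)
        v = np.linalg.solve(self.chol, k_star)
        var = float(rbf_kernel([x], [x])[0, 0] - (v.T @ v)[0, 0])
        return mean, float(np.sqrt(max(var, 0.0)))


def safeguard(x, u_ref, gp, x_max=1.0, eta=0.1, beta=2.0):
    """Cap u_ref so that the GP upper bound on phi' is at most -eta.

    Assumed safety index: phi(x) = x - x_max, so phi' = f(x) + u.
    The reference control passes through unchanged while phi(x) < 0.
    """
    phi = x - x_max
    if phi < 0.0:
        return u_ref
    mean, std = gp.predict(x)
    u_bound = -eta - (mean + beta * std)
    return min(u_ref, u_bound)


# Offline dataset: samples of an assumed "true" drift f(x) = sin(x).
x_data = np.linspace(0.0, 2.0, 21)
gp_model = GPDrift(x_data, np.sin(x_data))

u_inside = safeguard(0.5, 1.0, gp_model)    # state inside the safe set
u_boundary = safeguard(1.2, 1.0, gp_model)  # state beyond the boundary
```

In this sketch the guarantee is only as good as the confidence bound `mean + beta * std`, which is why the coverage of the offline dataset near the safety boundary matters; the toy example simply samples the state space uniformly.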