AI systems powered by reinforcement learning (RL) algorithms have great potential to drive societal progress, yet their deployment is often impeded by significant safety concerns. Particularly in safety-critical applications, researchers have raised concerns about unintended harms or unsafe behaviors of unaligned RL agents. Safe reinforcement learning (SafeRL) aims to align RL agents with harmless intentions and safe behavioral patterns. In SafeRL, agents learn optimal policies from environment feedback while also minimizing the risk of unintended harm or unsafe behavior. However, because SafeRL algorithms are intricate to implement, combining methodologies across different domains is a formidable challenge. This has led to the absence of a cohesive and effective learning framework in current SafeRL research. In this work, we introduce a foundational framework designed to expedite SafeRL research. Our comprehensive framework encompasses an array of algorithms spanning different RL domains and places heavy emphasis on safety elements. Our aim is to make the SafeRL research process more streamlined and efficient, thereby facilitating further research in AI safety. Our project is released at: https://github.com/PKU-Alignment/omnisafe.