Safety is a crucial property of every robotic platform: any control policy should always comply with actuator limits and avoid collisions with the environment and humans. In reinforcement learning, safety matters even more, because the agent must explore its environment without causing damage. While many solutions to the safe exploration problem have been proposed, only a few of them can deal with the complexity of the real world. This paper introduces a new formulation of safe exploration for reinforcement learning of various robotic tasks. Our approach applies to a wide class of robotic platforms and enforces safety by exploring the tangent space of the constraint manifold, even under complex collision constraints learned from data. Our proposed approach achieves state-of-the-art performance in simulated high-dimensional and dynamic tasks while avoiding collisions with the environment. We show safe real-world deployment of our learned controller on a TIAGo++ robot, achieving remarkable performance in manipulation and human-robot interaction tasks.
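To make the phrase "exploring the tangent space of the constraint manifold" concrete, the following is a minimal sketch and not the paper's implementation. It assumes a toy equality constraint c(q) = |q|^2 - 1 = 0 (a point kept on the unit circle); the function names, the gain, and the step size are all hypothetical choices made for illustration. The idea shown is that a velocity built from a basis of the null space of the constraint Jacobian leaves c(q) unchanged to first order, while a small correction term pulls the state back toward the manifold to counter integration drift.

```python
import numpy as np


def constraint(q):
    # Hypothetical toy constraint: c(q) = |q|^2 - 1 = 0 (unit circle).
    return np.array([q @ q - 1.0])


def constraint_jacobian(q):
    # Jacobian of the toy constraint, shape (1, n).
    return (2.0 * q).reshape(1, -1)


def tangent_basis(J, tol=1e-10):
    # Orthonormal basis of the null space of J, taken from the SVD.
    _, s, vt = np.linalg.svd(J, full_matrices=True)
    rank = int(np.sum(s > tol))
    return vt[rank:].T


def safe_velocity(q, alpha, gain=10.0):
    # alpha holds the action's coordinates in the tangent basis.
    J = constraint_jacobian(q)
    N = tangent_basis(J)
    # Correction term (an assumption of this sketch) that reduces any
    # accumulated constraint violation caused by discrete integration.
    correction = -gain * np.linalg.pinv(J) @ constraint(q)
    return N @ alpha + correction


def rollout(steps=1000, dt=0.01, seed=0):
    # Apply random tangent-space actions and integrate with explicit Euler.
    rng = np.random.default_rng(seed)
    q = np.array([1.0, 0.0])
    worst = 0.0
    for _ in range(steps):
        alpha = rng.uniform(-0.5, 0.5, size=1)
        q = q + dt * safe_velocity(q, alpha)
        worst = max(worst, abs(constraint(q)[0]))
    return q, worst


if __name__ == "__main__":
    q_final, worst_violation = rollout()
    print("final state:", q_final)
    print("largest constraint violation:", worst_violation)
```

In this sketch the random actions stand in for an exploring policy: whatever values of `alpha` are sampled, the resulting motion stays close to the constraint set. Learned collision constraints and actuator limits, as described in the abstract, would require a richer constraint function than this toy example.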