Deep reinforcement learning (RL) has shown promising results in robot motion planning, with first attempts in human-robot collaboration (HRC). However, a fair comparison of RL approaches in HRC under the constraint of guaranteed safety is yet to be made. We therefore present human-robot gym, a benchmark suite for safe RL in HRC. Our benchmark suite provides eight challenging, realistic HRC tasks in a modular simulation framework. Most importantly, human-robot gym includes a safety shield that provably guarantees human safety. We are thereby the first to provide a benchmark suite for training RL agents that adhere to the safety specifications of real-world HRC. This bridges a critical gap between theoretical RL research and its real-world deployment. Our evaluation of six tasks led to three key results: (a) the diverse nature of the tasks offered by human-robot gym creates a challenging benchmark for state-of-the-art RL methods, (b) incorporating expert knowledge into RL training in the form of an action-based reward can yield agents that outperform the expert, and (c) our agents overfit only negligibly to the training data.