Experimentation on real robots is demanding in terms of time and cost. For this reason, a large part of the reinforcement learning (RL) community uses simulators to develop and benchmark algorithms. However, insights gained in simulation do not necessarily translate to real robots, in particular for tasks involving complex interactions with the environment. The Real Robot Challenge 2022 therefore served as a bridge between the RL and robotics communities by allowing participants to experiment remotely with a real robot as easily as in simulation. In recent years, offline reinforcement learning has matured into a promising paradigm for learning from pre-collected datasets, alleviating the reliance on expensive online interactions. We therefore asked the participants to learn two dexterous manipulation tasks involving pushing, grasping, and in-hand orientation from provided real-robot datasets. Extensive software documentation and an initial stage based on a simulation of the real setup made the competition particularly accessible. By giving each team a generous access budget to evaluate their offline-learned policies on a cluster of seven identical real TriFinger platforms, we organized an exciting competition for machine learners and roboticists alike. In this work, we state the rules of the competition, present the methods used by the winning teams, and compare their results with a benchmark of state-of-the-art offline RL algorithms on the challenge datasets.