Real Robot Challenge 2022: Learning Dexterous Manipulation from Offline Data in the Real World

Nico Gürtler, Felix Widmaier, Cansu Sancaktar, Sebastian Blaes, Pavel Kolev, Stefan Bauer, Manuel Wüthrich, Markus Wulfmeier, Martin Riedmiller, Arthur Allshire, Qiang Wang, Robert McCarthy, Hangyeol Kim, Jongchan Baek, Wookyong Kwon, Shanliang Qian, Yasunori Toshimitsu, Mike Yan Michelis, Amirhossein Kazemipour, Arman Raayatsanati, Hehui Zheng, Barnabas Gavin Cangan, Bernhard Schölkopf, Georg Martius

Source: arXiv. Comment: typo in author list fixed.

Experimentation on real robots is demanding in terms of time and cost. For this reason, a large part of the reinforcement learning (RL) community uses simulators to develop and benchmark algorithms. However, insights gained in simulation do not necessarily translate to real robots, in particular for tasks involving complex interactions with the environment. The Real Robot Challenge 2022 therefore served as a bridge between the RL and robotics communities by allowing participants to experiment remotely with a real robot as easily as in simulation. In recent years, offline reinforcement learning has matured into a promising paradigm for learning from pre-collected datasets, alleviating the reliance on expensive online interactions. We therefore asked the participants to learn two dexterous manipulation tasks involving pushing, grasping, and in-hand orientation from provided real-robot datasets. Extensive software documentation and an initial stage based on a simulation of the real setup made the competition particularly accessible. By giving each team a generous access budget to evaluate their offline-learned policies on a cluster of seven identical real TriFinger platforms, we organized an exciting competition for machine learners and roboticists alike. In this work, we state the rules of the competition, present the methods used by the winning teams, and compare their results with a benchmark of state-of-the-art offline RL algorithms on the challenge datasets.
