Recent years in robotics and imitation learning have seen remarkable progress in training large-scale foundation models by leveraging data across a multitude of embodiments. The success of such policies raises a question: just how diverse can the robots in the training set be while still facilitating positive transfer? In this work, we study this question in the context of heterogeneous embodiments, examining how even seemingly very different domains, such as robotic navigation and manipulation, can provide benefits when included in the training data for the same model. We train a single goal-conditioned policy that is capable of controlling robotic arms, quadcopters, quadrupeds, and mobile bases. We then investigate the extent to which transfer can occur across navigation and manipulation on these embodiments by framing them as a single goal-reaching task. We find that co-training with navigation data can enhance robustness and performance in goal-conditioned manipulation with a wrist-mounted camera. We then deploy our policy, trained only on navigation data and static manipulation data, on a mobile manipulator, showing that it can control a novel embodiment in a zero-shot manner. These results provide evidence that large-scale robotic policies can benefit from data collected across various embodiments. Further information and robot videos can be found on our project website: http://extreme-cross-embodiment.github.io.