Humanoid robots capable of autonomous operation in diverse environments have long been a goal for roboticists. However, autonomous manipulation by humanoid robots has largely been restricted to specific scenes, primarily due to the difficulty of acquiring generalizable skills. Recent advances in 3D visuomotor policies, such as the 3D Diffusion Policy (DP3), have shown promise in extending these capabilities to wilder environments. However, 3D visuomotor policies often rely on camera calibration and point-cloud segmentation, which present challenges for deployment on mobile robots such as humanoids. In this work, we introduce the Improved 3D Diffusion Policy (iDP3), a novel 3D visuomotor policy that eliminates these constraints by leveraging egocentric 3D visual representations. We demonstrate that iDP3 enables a full-sized humanoid robot to autonomously perform skills in diverse real-world scenarios, using only data collected in the lab. Videos are available at: https://humanoid-manipulation.github.io