We propose DemoDiffusion, a simple method that enables robots to perform manipulation tasks by imitating a single human demonstration, without task-specific training or paired human-robot data. Our approach builds on two insights. First, the hand motion in a human demonstration provides a useful prior for the robot's end-effector trajectory, which we convert into a rough open-loop robot motion trajectory via kinematic retargeting. Second, while this retargeted motion captures the overall structure of the task, it may not align well with plausible robot actions in context. To address this, we use a pre-trained generalist diffusion policy to modify the trajectory so that it both follows the human motion and remains within the distribution of plausible robot actions. Unlike approaches based on online reinforcement learning or paired human-robot data, our method enables robust adaptation to new tasks and scenes with minimal effort. In real-world experiments across 8 diverse manipulation tasks, DemoDiffusion achieves an 83.8\% average success rate, compared to 13.8\% for the pre-trained policy and 52.5\% for kinematic retargeting, and it succeeds even on tasks where the pre-trained generalist policy fails entirely. Project page: https://demodiffusion.github.io/
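The two steps described above (retarget the human hand motion, then let a diffusion policy pull the result toward plausible robot actions) can be illustrated with a minimal sketch. The abstract does not specify how the policy modifies the trajectory; the sketch assumes one common recipe, in which the retargeted actions are noised to an intermediate diffusion step and then denoised. All names here (`make_alphas_cumprod`, `refine_trajectory`, `make_toy_denoiser`, `start_step`) are hypothetical, and the toy denoiser is only a stand-in for a real pre-trained policy.

```python
import numpy as np


def make_alphas_cumprod(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear noise schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def refine_trajectory(retargeted, denoise_step, alphas_cumprod, start_step, rng):
    """Noise a retargeted action trajectory to `start_step`, then denoise it.

    retargeted:   array of shape (horizon, action_dim) from kinematic retargeting.
    denoise_step: callable (x, t) -> x, one reverse-diffusion step of the policy
                  (assumption: a real policy would also condition on observations).
    start_step:   larger values trust the policy more, smaller values trust
                  the human demonstration more.
    """
    a_bar = alphas_cumprod[start_step]
    noise = rng.standard_normal(retargeted.shape)
    # Forward-diffuse the retargeted actions to the intermediate step.
    x = np.sqrt(a_bar) * retargeted + np.sqrt(1.0 - a_bar) * noise
    # Run the reverse process from start_step down to step 0.
    for t in range(start_step, -1, -1):
        x = denoise_step(x, t)
    return x


def make_toy_denoiser(plausible_actions, rate=0.2):
    """Hypothetical stand-in for a pre-trained policy: each step moves the
    sample a fraction `rate` of the way toward a fixed 'plausible' trajectory."""
    def denoise_step(x, t):
        return x + rate * (plausible_actions - x)
    return denoise_step


rng = np.random.default_rng(0)
retargeted = np.linspace(0.0, 1.0, 16).reshape(8, 2)   # 8 steps, 2-D actions
plausible = retargeted + 0.05                           # toy "in-distribution" target
refined = refine_trajectory(
    retargeted,
    make_toy_denoiser(plausible),
    make_alphas_cumprod(50),
    start_step=10,
    rng=rng,
)
```

In this sketch, `start_step` is the single knob that trades off fidelity to the human motion against staying close to what the policy considers plausible; starting from the last diffusion step would ignore the demonstration almost entirely, while `start_step=0` would leave the retargeted trajectory nearly unchanged.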