We propose DemoDiffusion, a simple method for enabling robots to perform manipulation tasks by imitating a single human demonstration, without requiring task-specific training or paired human-robot data. Our approach is based on two insights. First, the hand motion in a human demonstration provides a useful prior for the robot's end-effector trajectory, which we can convert into a rough open-loop robot motion trajectory via kinematic retargeting. Second, while this retargeted motion captures the overall structure of the task, it may not align well with plausible robot actions in context. To address this, we leverage a pre-trained generalist diffusion policy to modify the trajectory, ensuring it both follows the human motion and remains within the distribution of plausible robot actions. Unlike approaches based on online reinforcement learning or paired human-robot data, our method enables robust adaptation to new tasks and scenes with minimal effort. In real-world experiments across 8 diverse manipulation tasks, DemoDiffusion achieves an average success rate of 83.8%, compared to 13.8% for the pre-trained policy and 52.5% for kinematic retargeting, succeeding even on tasks where the pre-trained generalist policy fails entirely. Project page: https://demodiffusion.github.io/
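The abstract does not spell out how the diffusion policy "modifies" the retargeted trajectory. One plausible reading, sketched below purely as an assumption, is partial noising followed by denoising: the retargeted actions are noised to an intermediate diffusion step and then denoised by the policy, so the result stays near the human motion while being pulled toward the policy's action distribution. The function names, the DDPM-style linear noise schedule, and the `predict_noise` callable are all hypothetical stand-ins; a real generalist policy would also condition on observations and run closed-loop.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM-style noise schedule (an assumed choice, not from the source)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars


def refine_retargeted_actions(retargeted, predict_noise, num_steps=50,
                              start_frac=0.4, rng=None):
    """Noise retargeted actions to an intermediate step, then denoise them.

    retargeted:    array of shape (horizon, action_dim) from kinematic retargeting.
    predict_noise: hypothetical callable (x, t) -> predicted noise, standing in
                   for a pre-trained diffusion policy.
    start_frac:    fraction of the diffusion chain to run; 0 returns the
                   retargeted actions unchanged, 1 ignores them almost entirely.
    """
    rng = np.random.default_rng() if rng is None else rng
    betas, alpha_bars = make_schedule(num_steps)
    start = int(round(start_frac * num_steps))
    if start == 0:
        return np.array(retargeted, dtype=float)

    # Forward process: jump directly to the noise level of step start - 1.
    t0 = start - 1
    x = (np.sqrt(alpha_bars[t0]) * retargeted
         + np.sqrt(1.0 - alpha_bars[t0]) * rng.standard_normal(retargeted.shape))

    # Reverse process: standard DDPM ancestral sampling down to step 0.
    for t in range(t0, -1, -1):
        eps = predict_noise(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(1.0 - betas[t])
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x
```

In this sketch `start_frac` is the knob that trades off fidelity to the human demonstration against adherence to the policy's prior; how such a value would be chosen in practice is not stated in the abstract.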