Diffusion generative modeling has become a promising approach for learning robotic manipulation tasks from stochastic human demonstrations. In this paper, we present Diffusion-EDFs, a novel SE(3)-equivariant diffusion-based approach for visual robotic manipulation tasks. We show that the proposed method is highly data-efficient, requiring only 5 to 10 human demonstrations for effective end-to-end training in less than an hour. Furthermore, our benchmark experiments demonstrate that the approach generalizes better and is more robust than state-of-the-art methods. Lastly, we validate the method with real hardware experiments. Project website: https://sites.google.com/view/diffusion-edfs/home