Recent advances in imitation learning, particularly those using generative modelling techniques such as diffusion, have enabled policies to capture complex multi-modal action distributions. However, these methods often require large datasets and multiple inference steps for action generation, posing challenges in robotics, where data collection is costly and computational resources are limited. To address this, we introduce IMLE Policy, a novel behaviour cloning approach based on Implicit Maximum Likelihood Estimation (IMLE). IMLE Policy excels in low-data regimes, learning effectively from minimal demonstrations and requiring on average 38\% less data to match the performance of baseline methods in learning complex multi-modal behaviours. Its simple generator-based architecture enables single-step action generation, improving inference speed by 97.3\% over Diffusion Policy while outperforming single-step Flow Matching. We validate our approach across diverse manipulation tasks in simulated and real-world environments, showcasing its ability to capture complex behaviours under data constraints. Videos and code are provided on our project page: https://imle-policy.github.io/.
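To make the core idea concrete, the general IMLE objective trains a generator so that every ground-truth sample has at least one nearby generated sample: for each data point, draw several latent codes, keep only the nearest generated output, and minimise that distance. The sketch below illustrates this with a hypothetical linear generator in NumPy; the generator form, sample counts, and names are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear generator G(z) = W z + b mapping a latent code
# to an action; the paper's policy uses a learned neural generator.
def generator(W, b, z):
    return z @ W.T + b

def imle_loss(W, b, actions, n_samples=32):
    """IMLE-style objective (sketch): for each ground-truth action,
    draw latent samples, keep only the nearest generated candidate,
    and penalise its squared distance. Unmatched candidates incur
    no loss, which is what lets the model cover multiple modes."""
    total = 0.0
    for a in actions:
        z = rng.normal(size=(n_samples, W.shape[1]))  # latent draws
        candidates = generator(W, b, z)               # one-step generation
        d2 = ((candidates - a) ** 2).sum(axis=1)      # squared distances
        total += d2.min()                             # nearest-sample term
    return total / len(actions)
```

At inference time, a single latent draw and one generator pass produce an action, which is the source of the single-step speed advantage the abstract cites over iterative diffusion sampling.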