Imitation learning has shown great potential for enabling robots to acquire complex manipulation behaviors. However, imitation learning algorithms suffer from high sample complexity in long-horizon tasks, where errors compound over the task horizon. We present PRIME (PRimitive-based IMitation with data Efficiency), a framework built on behavior primitives and designed to improve the data efficiency of imitation learning. PRIME scaffolds robot tasks by decomposing task demonstrations into primitive sequences and then learning, through imitation learning, a high-level control policy that sequences the primitives. Our experiments demonstrate that PRIME achieves a significant performance improvement in multi-stage manipulation tasks, with 10-34% higher success rates in simulation over state-of-the-art baselines and 20-48% higher on physical hardware.