Imitation learning from human demonstrations can teach robots complex manipulation skills, but collecting the demonstrations is time-consuming and labor-intensive. In contrast, Task and Motion Planning (TAMP) systems are automated and excel at solving long-horizon tasks, but they are difficult to apply to contact-rich tasks. In this paper, we present Human-in-the-Loop Task and Motion Planning (HITL-TAMP), a novel system that leverages the benefits of both approaches. The system employs a TAMP-gated control mechanism, which selectively hands control to a human teleoperator and takes it back. This enables the human teleoperator to manage a fleet of robots, maximizing data collection efficiency. The collected human data is then combined with an imitation learning framework to train a TAMP-gated policy, leading to superior performance compared to training on full task demonstrations. We compared HITL-TAMP to a conventional teleoperation system: users gathered more than 3x the number of demos given the same time budget. Furthermore, proficient agents (75%+ success) could be trained from just 10 minutes of non-expert teleoperation data. Finally, we collected 2.1K demos with HITL-TAMP across 12 contact-rich, long-horizon tasks and show that the system often produces near-perfect agents. Videos and additional results are available at https://hitltamp.github.io.