Complex manipulation tasks often require robots with complementary capabilities to collaborate. We introduce a benchmark for LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) focused on task allocation and long-horizon object manipulation based on human language instructions in a tabletop setting. LEMMA features 8 types of procedurally generated tasks with varying degrees of complexity, some of which require the robots to use tools and pass tools to each other. For each task, we provide 800 expert demonstrations and human instructions for training and evaluation. LEMMA poses greater challenges than existing benchmarks, as it requires the system to identify each manipulator's limitations and assign sub-tasks accordingly while also handling strong temporal dependencies in each task. To address these challenges, we propose a modular hierarchical planning approach as a baseline. Our results highlight the potential of LEMMA for developing future language-conditioned multi-robot systems.