Language models (LMs) comprehend natural language well, which makes them effective at translating human instructions into detailed plans for simple robot tasks. Long-horizon tasks remain a significant challenge, however, especially subtask identification and allocation for cooperative heterogeneous robot teams. To address this issue, we propose the Language Model-Driven Multi-Agent PDDL Planner (LaMMA-P), a novel multi-agent task planning framework that achieves state-of-the-art performance on long-horizon tasks. LaMMA-P combines the reasoning capability of LMs with a traditional heuristic search planner to achieve a high success rate and efficiency while generalizing well across tasks. We also create MAT-THOR, a comprehensive benchmark built on the AI2-THOR environment that features household tasks at two levels of complexity. Experimental results show that LaMMA-P achieves a 105% higher success rate and 36% higher efficiency than existing LM-based multi-agent planners. Experimental videos, code, datasets, and the detailed prompts used in each module are available at https://lamma-p.github.io.
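To make the notion of subtask allocation for a heterogeneous robot team concrete, the following is a minimal sketch, not the LaMMA-P method itself. It assumes that each robot advertises a set of skills and that each subtask lists the skills it requires; the names `Robot`, `Subtask`, and `allocate`, the skill labels, and the greedy least-loaded rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Robot:
    # Hypothetical robot description: a name and the skills it can execute.
    name: str
    skills: frozenset


@dataclass(frozen=True)
class Subtask:
    # Hypothetical subtask description: a name and the skills it needs.
    name: str
    required_skills: frozenset


def allocate(subtasks, robots):
    """Assign each subtask to the least-loaded robot that has every required skill.

    This is a simple greedy rule chosen for illustration only.
    """
    load = {robot.name: 0 for robot in robots}
    assignment = {}
    for task in subtasks:
        # A robot is capable if the required skills are a subset of its skills.
        capable = [r for r in robots if task.required_skills <= r.skills]
        if not capable:
            raise ValueError(f"no robot can perform {task.name!r}")
        # min() returns the first robot with the smallest current load.
        chosen = min(capable, key=lambda r: load[r.name])
        assignment[task.name] = chosen.name
        load[chosen.name] += 1
    return assignment


# Illustrative team and tasks (made-up values, not taken from MAT-THOR).
robots = [
    Robot("robot1", frozenset({"pick", "place"})),
    Robot("robot2", frozenset({"pick", "place", "slice"})),
]
subtasks = [
    Subtask("slice_tomato", frozenset({"pick", "slice"})),
    Subtask("put_plate_on_table", frozenset({"pick", "place"})),
]
plan = allocate(subtasks, robots)
```

In this sketch only `robot2` has the `slice` skill, so it receives `slice_tomato`, and the remaining subtask goes to the idle `robot1`. A real planner would also need to handle ordering constraints between subtasks, which this example ignores.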