The ability of Language Models (LMs) to understand natural language makes them a powerful tool for parsing human instructions into task plans for autonomous robots. Unlike traditional planning methods that rely on domain-specific knowledge and handcrafted rules, LMs generalize from diverse data and adapt to various tasks with minimal tuning, acting as a compressed knowledge base. However, LMs in their standard form face challenges with long-horizon tasks, particularly in partially observable multi-agent settings. We propose an LM-based Long-Horizon Planner for Multi-Agent Robotics (LLaMAR), a cognitive architecture for planning that achieves state-of-the-art results in long-horizon tasks within partially observable environments. LLaMAR employs a plan-act-correct-verify framework, allowing self-correction from action execution feedback without relying on oracles or simulators. Additionally, we present MAP-THOR, a comprehensive test suite encompassing household tasks of varying complexity within the AI2-THOR environment. Experiments show that LLaMAR achieves a 30% higher success rate compared to other state-of-the-art LM-based multi-agent planners.
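The plan-act-correct-verify loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Memory` class, the `run_episode` function, and the four module callables are hypothetical names chosen for illustration, and plain Python functions stand in for the LM-backed modules. The toy "environment" returns only textual feedback from executed actions, which reflects the idea of self-correction without an oracle or simulator.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    """Hypothetical shared memory carried across planning steps."""
    open_subtasks: list
    completed: list = field(default_factory=list)
    failures: list = field(default_factory=list)


def run_episode(subtasks, plan, act, correct, verify, execute, max_steps=20):
    """Illustrative plan-act-correct-verify loop (an assumption, not LLaMAR's code)."""
    memory = Memory(open_subtasks=list(subtasks))
    for _ in range(max_steps):
        if not memory.open_subtasks:
            break                                   # nothing left to do
        goal = plan(memory)                         # plan: choose a subtask
        action = act(goal, memory)                  # act: choose an action for it
        ok, feedback = execute(action)              # feedback comes from execution
        if not ok:
            memory.failures.append(feedback)
            action = correct(action, feedback)      # correct: propose a fix
            ok, feedback = execute(action)
        if ok and verify(goal, feedback):           # verify: is the subtask done?
            memory.open_subtasks.remove(goal)
            memory.completed.append(goal)
    return memory


def demo():
    """Toy household task: the fridge must be opened before the apple goes in."""
    state = {"fridge_open": False, "apple_in_fridge": False}

    def execute(action):
        if action == "open fridge":
            state["fridge_open"] = True
            return True, "fridge is open"
        if action == "put apple in fridge":
            if not state["fridge_open"]:
                return False, "fridge is closed"
            state["apple_in_fridge"] = True
            return True, "apple is in fridge"
        return False, "unknown action"

    def plan(memory):
        return memory.open_subtasks[0]

    def act(goal, memory):
        return goal  # the toy actor uses the subtask text as the action

    def correct(action, feedback):
        return "open fridge" if feedback == "fridge is closed" else action

    def verify(goal, observation):
        return observation == "apple is in fridge"

    return run_episode(["put apple in fridge"], plan, act, correct, verify, execute)


if __name__ == "__main__":
    result = demo()
    print(result.completed, result.failures)
```

In this toy run the first attempt fails because the fridge is closed, the corrector proposes opening it, and the subtask is only marked complete once the verifier sees the expected observation on the next step.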