In this work, we address the cooperation problem among large language model (LLM) based embodied agents, where agents must cooperate to achieve a common goal. Previous methods often execute actions extemporaneously and incoherently, without long-term strategic and cooperative planning, leading to redundant steps, failures, and even serious repercussions in complex tasks like search-and-rescue missions, where discussion and cooperative planning are crucial. To solve this issue, we propose Cooperative Plan Optimization (CaPo) to enhance the cooperation efficiency of LLM-based embodied agents. Inspired by human cooperation schemes, CaPo improves cooperation efficiency with two phases: 1) meta-plan generation, and 2) progress-adaptive meta-plan and execution. In the first phase, all agents analyze the task, discuss, and cooperatively create a meta-plan that decomposes the task into subtasks with detailed steps, ensuring a long-term, strategic, and coherent plan for efficient coordination. In the second phase, agents execute tasks according to the meta-plan and dynamically adjust it based on their latest progress (e.g., discovering a target object) through multi-turn discussions. This progress-based adaptation eliminates redundant actions, improving the overall cooperation efficiency of agents. Experimental results on the ThreeDWorld Multi-Agent Transport and Communicative Watch-And-Help tasks demonstrate that CaPo achieves much higher task completion rates and efficiency than state-of-the-art methods.
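The two-phase scheme above can be sketched as a simple control loop. This is a minimal illustrative sketch only: `Agent`, `discuss_meta_plan`, and the toy list-of-subtasks task are hypothetical stand-ins for the LLM-driven discussion and embodied execution described in the abstract, not the authors' implementation.

```python
class Agent:
    """Toy stand-in for an LLM-based embodied agent."""

    def __init__(self, name):
        self.name = name
        self.progress = None  # e.g., a newly discovered target object

    def act(self, subtask):
        # Placeholder: in CaPo this would be an LLM-driven embodied action.
        self.progress = subtask
        return self.progress


def discuss_meta_plan(agents, task, old_plan=None, completed=None):
    # Placeholder for the multi-turn discussion: decompose the task into
    # remaining subtasks, dropping any that agents have already finished.
    done = set(completed or [])
    remaining = [s for s in task if s not in done]
    return remaining or list(task)


def capo_loop(agents, task, max_rounds=10):
    # Phase 1: cooperative meta-plan generation.
    meta_plan = discuss_meta_plan(agents, task)
    completed = set()
    for _ in range(max_rounds):
        # Phase 2: execute subtasks according to the current meta-plan...
        for agent, subtask in zip(agents, meta_plan):
            agent.act(subtask)
            completed.add(subtask)
        if completed >= set(task):
            break
        # ...and adapt the meta-plan to the agents' latest progress.
        meta_plan = discuss_meta_plan(agents, task, meta_plan, completed)
    return completed
```

Replanning only when new progress accumulates (rather than every step) is what keeps execution coherent and avoids redundant actions.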