Large Language Models (LLMs) are increasingly used for interactive decision-making tasks that require planning and adapting to the environment. Recent work employs LLMs as agents in broadly two ways: iteratively determining the next action (iterative executors), or generating a plan and executing its sub-tasks with LLMs (plan-and-execute). However, these methods struggle as task complexity grows, because failure to execute any single sub-task may cause the whole task to fail. To address these shortcomings, we introduce As-Needed Decomposition and Planning for complex Tasks (ADaPT), an approach that explicitly plans and decomposes complex sub-tasks as needed, i.e., when the LLM is unable to execute them. ADaPT recursively decomposes sub-tasks to adapt to both task complexity and LLM capability. Our results demonstrate that ADaPT substantially outperforms established strong baselines, achieving success rates up to 28.3% higher in ALFWorld, 27% higher in WebShop, and 33% higher in TextCraft, a novel compositional dataset that we introduce. Through extensive analysis, we illustrate the importance of multilevel decomposition and establish that ADaPT dynamically adjusts to the capabilities of the executor LLM as well as to task complexity.
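The as-needed recursive decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `execute` and `plan` callables stand in for the LLM executor and planner, the `max_depth` cutoff and the rule that every sub-task must succeed are simplifying assumptions, and the toy recipe table is purely hypothetical.

```python
from typing import Callable, Dict, List

Executor = Callable[[str], bool]      # hypothetical: tries a task, True on success
Planner = Callable[[str], List[str]]  # hypothetical: splits a task into sub-tasks


def solve(task: str, execute: Executor, plan: Planner,
          depth: int = 1, max_depth: int = 3) -> bool:
    """Try to execute `task` directly; decompose only if execution fails."""
    if execute(task):
        return True                   # executor succeeded, no planning needed
    if depth >= max_depth:
        return False                  # assumed cutoff on recursion depth
    subtasks = plan(task)
    if not subtasks:
        return False                  # planner could not decompose the task
    # Assumption: the task succeeds only if every sub-task succeeds, in order.
    return all(solve(sub, execute, plan, depth + 1, max_depth)
               for sub in subtasks)


# Toy stand-ins for the LLM calls (hypothetical data, for illustration only).
RECIPES: Dict[str, List[str]] = {
    "craft bed": ["craft planks", "get wool"],
    "craft planks": ["get logs"],
}
PRIMITIVE = {"get wool", "get logs"}


def toy_execute(task: str) -> bool:
    return task in PRIMITIVE          # only primitive tasks run directly


def toy_plan(task: str) -> List[str]:
    return RECIPES.get(task, [])      # unknown tasks yield no plan


if __name__ == "__main__":
    print(solve("craft bed", toy_execute, toy_plan))               # True
    print(solve("craft bed", toy_execute, toy_plan, max_depth=2))  # False
```

In this toy setting, "craft bed" needs two levels of decomposition, so it succeeds with the default depth limit but fails when `max_depth=2`, which mirrors the abstract's point about multilevel decomposition.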