While large language models (LLMs) have demonstrated impressive performance on a range of decision-making tasks, they rely on simple acting processes and fall short of broad deployment as autonomous agents. We introduce LATS (Language Agent Tree Search), a general framework that unifies the capabilities of LLMs in planning, acting, and reasoning. Drawing inspiration from Monte Carlo tree search in model-based reinforcement learning, LATS employs LLMs as agents, value functions, and optimizers, repurposing their latent strengths for enhanced decision-making. A key element of this method is the use of an environment for external feedback, which enables more deliberate and adaptive problem solving than existing techniques allow. Our experimental evaluation across diverse domains, such as programming, HotPotQA, and WebShop, illustrates the applicability of LATS to both reasoning and acting. In particular, LATS achieves 94.4% on HumanEval for programming with GPT-4 and an average score of 75.9 on WebShop for web browsing with GPT-3.5, demonstrating the effectiveness and generality of our method.
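To make the search loop described in the abstract more concrete, the following is a minimal sketch of a Monte Carlo tree search driven by environment feedback. It is not the authors' implementation. The three LLM roles are replaced by hypothetical deterministic stubs (`propose_actions` for the agent, `estimate_value` for the value function), the environment is a toy arithmetic task (`env_step`), and the reflection step that LATS also uses is omitted. All function names, the task, and the parameters `max_depth`, `iterations`, and `c` are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional


def propose_actions(state: int) -> List[str]:
    """Hypothetical stand-in for the LLM agent that samples candidate actions."""
    return ["+1", "*2"]


def estimate_value(state: int, target: int) -> float:
    """Hypothetical stand-in for the LLM value function; scores lie in (0, 1]."""
    return 1.0 / (1.0 + abs(target - state))


def env_step(state: int, action: str) -> int:
    """Toy external environment: applies the action and returns the new state."""
    return state + 1 if action == "+1" else state * 2


@dataclass
class Node:
    state: int
    parent: Optional["Node"] = None
    action: Optional[str] = None
    depth: int = 0
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    total: float = 0.0

    def uct(self, c: float = 1.0) -> float:
        # Unvisited nodes are tried first; otherwise balance mean value and exploration.
        if self.visits == 0:
            return math.inf
        explore = math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.total / self.visits + c * explore


def tree_search(start: int, target: int, max_depth: int = 6,
                iterations: int = 500) -> Optional[List[str]]:
    root = Node(state=start)
    for _ in range(iterations):
        # Selection: follow the highest UCT score down to a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: query the environment for each proposed action.
        if node.state != target and node.depth < max_depth:
            for action in propose_actions(node.state):
                child = Node(env_step(node.state, action), node, action, node.depth + 1)
                node.children.append(child)
            node = node.children[0]
        # Evaluation: success signal from the environment, else the value estimate.
        solved = node.state == target
        reward = 1.0 if solved else estimate_value(node.state, target)
        # Backpropagation: update statistics along the path to the root.
        cursor = node
        while cursor is not None:
            cursor.visits += 1
            cursor.total += reward
            cursor = cursor.parent
        if solved:
            # Recover the action sequence that reached the target.
            path = []
            while node.parent is not None:
                path.append(node.action)
                node = node.parent
            return path[::-1]
    return None


if __name__ == "__main__":
    print(tree_search(start=1, target=10))
```

In a setting closer to the one the abstract describes, the stubs would be replaced by prompted model calls and `env_step` by a real environment such as a code executor or a web-browsing simulator. The selection, expansion, evaluation, and backpropagation structure would stay the same.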