While large language models (LLMs) have demonstrated impressive performance on a range of decision-making tasks, they rely on simple acting processes and fall short of broad deployment as autonomous agents. We introduce LATS (Language Agent Tree Search), a general framework that unifies the capabilities of LLMs in planning, acting, and reasoning. Drawing inspiration from Monte Carlo tree search in model-based reinforcement learning, LATS employs LLMs as agents, value functions, and optimizers, repurposing their latent strengths for enhanced decision-making. Crucially, LATS uses an environment for external feedback, which yields a more deliberate and adaptive problem-solving mechanism that moves beyond the limitations of existing techniques. Our experimental evaluation across diverse domains, including programming (HumanEval), question answering (HotPotQA), and web browsing (WebShop), illustrates the applicability of LATS to both reasoning and acting. In particular, LATS achieves 94.4\% for programming on HumanEval with GPT-4 and an average score of 75.9 for web browsing on WebShop with GPT-3.5, demonstrating the effectiveness and generality of our method.
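To make the search loop concrete, the following is a minimal sketch of Monte Carlo tree search in which the three roles named above are played by stand-in functions. It is not the LATS implementation: the toy task (spell the string `TARGET`), the helper names `propose_actions`, `env_step`, and `estimate_value`, and the UCT constant are all assumptions made for illustration, and the optimizer role is omitted. In a real system the first and third helpers would presumably be LLM calls and `env_step` would query an actual environment.

```python
import math

TARGET = "abc"  # hypothetical toy goal, chosen only for illustration


class Node:
    def __init__(self, state, parent=None, terminal=False, reward=0.0):
        self.state = state
        self.parent = parent
        self.terminal = terminal
        self.reward = reward
        self.children = []
        self.visits = 0
        self.value = 0.0  # running mean of backed-up values


def propose_actions(state):
    """Stand-in for sampling candidate actions from an LLM agent."""
    return ["a", "b", "c"]


def env_step(state, action):
    """Stand-in environment giving external feedback: (next_state, reward, done)."""
    nxt = state + action
    done = len(nxt) == len(TARGET)
    return nxt, (1.0 if nxt == TARGET else 0.0), done


def estimate_value(state):
    """Stand-in for an LLM-scored value: fraction of TARGET matched as a prefix."""
    return len(state) / len(TARGET) if TARGET.startswith(state) else 0.0


def uct(node, c=1.0):
    """Upper confidence bound for trees: mean value plus an exploration bonus."""
    if node.visits == 0:
        return float("inf")
    return node.value + c * math.sqrt(math.log(node.parent.visits) / node.visits)


def backpropagate(node, value):
    """Update visit counts and running means from a node up to the root."""
    while node is not None:
        node.visits += 1
        node.value += (value - node.value) / node.visits
        node = node.parent


def search(root_state="", iterations=50):
    root = Node(root_state)
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        if node.terminal:
            backpropagate(node, node.reward)
            continue
        # Expansion and evaluation: try each proposed action in the environment.
        for action in propose_actions(node.state):
            nxt, reward, done = env_step(node.state, action)
            child = Node(nxt, parent=node, terminal=done, reward=reward)
            node.children.append(child)
            if done and reward == 1.0:
                return child.state  # environment confirmed success
            backpropagate(child, reward if done else estimate_value(nxt))
    return None


print(search())
```

The sketch returns as soon as the environment reports success, so the heuristic value only guides which branch is expanded next, while the environment's reward decides when the search stops.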