Recent studies have highlighted the proficiency of large language models (LLMs) in relatively simple tasks such as writing and coding when guided by various reasoning strategies. However, LLM agents still struggle with tasks that require comprehensive planning, which remains a critical open research problem. In this study, we concentrate on travel planning, a multi-phase planning problem that involves multiple interconnected stages, such as outlining, information gathering, and planning, and that often requires managing various constraints and uncertainties. Existing reasoning approaches have struggled to address this complex task effectively. Our research aims to address this challenge by developing a human-like planning framework for LLM agents, i.e., guiding the LLM agent to simulate the steps that humans take when solving multi-phase problems. Specifically, we implement several strategies that enable LLM agents to generate a coherent outline for each travel query, mirroring human planning patterns. Additionally, we integrate a Strategy Block and a Knowledge Block into our framework: the Strategy Block facilitates information collection, while the Knowledge Block provides essential information for detailed planning. Through extensive experiments, we demonstrate that our framework significantly improves the planning capabilities of LLM agents, enabling them to tackle the travel planning task more efficiently and effectively. When combined with GPT-4-Turbo, the proposed framework attains a $10\times$ performance gain over the baseline framework deployed on GPT-4-Turbo.
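As a rough illustration of the three phases named in the abstract (outlining, information gathering, detailed planning), the following minimal Python sketch shows one way such a pipeline could be wired together. It is not the authors' implementation: the names `KnowledgeBlock`, `make_outline`, `strategy_block`, and `plan_trip`, the query fields, and the data layout are all hypothetical, and every LLM or tool call is replaced by a deterministic stub.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBlock:
    """Hypothetical store of collected facts, keyed by outline step."""
    facts: dict[str, list[str]] = field(default_factory=dict)

    def add(self, step: str, fact: str) -> None:
        self.facts.setdefault(step, []).append(fact)

    def lookup(self, step: str) -> list[str]:
        return self.facts.get(step, [])


def make_outline(query: dict) -> list[str]:
    """Phase 1 (assumed): one outline step per day; stands in for an LLM call."""
    return [f"Day {d}: {query['destination']}" for d in range(1, query["days"] + 1)]


def strategy_block(step: str) -> list[str]:
    """Phase 2 (assumed): list what to look up for a step; stands in for an LLM call."""
    return [f"transport for {step}", f"lodging for {step}", f"meals for {step}"]


def plan_trip(query: dict) -> list[str]:
    """Phase 3 (assumed): build the detailed plan from the outline and stored facts."""
    knowledge = KnowledgeBlock()
    outline = make_outline(query)
    for step in outline:
        for need in strategy_block(step):
            # A real agent would query a search tool here; we store a placeholder.
            knowledge.add(step, f"<result of looking up {need}>")
    return [f"{step} | {len(knowledge.lookup(step))} facts used" for step in outline]


plan = plan_trip({"destination": "Chicago", "days": 3})
```

In this sketch the outline fixes the structure before any information is collected, so the information-gathering step only looks up what each outline step needs, and the planner reads from the stored facts rather than from raw tool output.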