Recent advances in Large Language Models (LLMs) enable AI agents to automatically generate and execute multi-step plans to solve complex tasks. However, because the content generation process of LLMs is hard to control, current LLM-based agents frequently generate invalid or non-executable plans, which degrades the performance of the generated plans and erodes users' trust in LLM-based agents. In response, this paper proposes a novel ``Formal-LLM'' framework for LLM-based agents that integrates the expressiveness of natural language with the precision of formal language. Specifically, the framework allows human users to express their requirements or constraints for the planning process as an automaton. A stack-based LLM plan generation process is then conducted under the supervision of the automaton to ensure that the generated plan satisfies the constraints, making the planning process controllable. We conduct experiments on both benchmark tasks and practical real-life tasks, and our framework achieves an overall performance increase of over 50%, which validates the feasibility and effectiveness of employing Formal-LLM to guide the plan generation of agents and to prevent them from generating invalid and unsuccessful plans. Furthermore, more controllable LLM-based agents can facilitate broader use of LLMs in application scenarios where high planning validity is essential. The work is open-sourced at https://github.com/agiresearch/Formal-LLM.
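To make the idea of automaton-supervised, stack-based plan generation concrete, the following is a minimal sketch and not the authors' implementation. It assumes the user's constraints are written as a small context-free grammar over data types, and that a `choose` callback stands in for the LLM: at each step the automaton offers only the valid productions, and the callback merely picks one of them. The grammar, the tool names (`image_captioning`, `image_deblurring`, `translation`), the function `generate_plan`, and the `max_steps` cutoff are all hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical constraint grammar. Keys are nonterminals (data types);
# any symbol that is not a key is treated as a terminal (a tool or an input).
# Assumption: the first production of every nonterminal is non-recursive,
# so it can be used to force termination.
GRAMMAR: Dict[str, List[List[str]]] = {
    "TEXT": [
        ["input_text"],
        ["IMAGE", "image_captioning"],
        ["TEXT", "translation"],
    ],
    "IMAGE": [
        ["input_image"],
        ["IMAGE", "image_deblurring"],
    ],
}

# Stand-in for the LLM: given the symbol being expanded and the valid
# productions, return the index of the chosen production.
Chooser = Callable[[str, Sequence[List[str]]], int]


def generate_plan(start: str, choose: Chooser, max_steps: int = 20) -> List[str]:
    """Expand `start` into a sequence of tools using an explicit stack."""
    stack: List[str] = [start]
    plan: List[str] = []
    steps = 0
    while stack:
        symbol = stack.pop()
        if symbol not in GRAMMAR:
            # Terminal symbol: emit it as the next plan step.
            plan.append(symbol)
            continue
        options = GRAMMAR[symbol]
        steps += 1
        if steps > max_steps:
            # Budget exhausted: force the non-recursive production.
            index = 0
        else:
            index = choose(symbol, options)
            if not 0 <= index < len(options):
                raise ValueError(f"invalid choice {index} for {symbol}")
        # Push in reverse so the leftmost symbol is expanded first.
        stack.extend(reversed(options[index]))
    return plan


if __name__ == "__main__":
    scripted = iter([1, 1, 0])  # fixed choices in place of real LLM output
    print(generate_plan("TEXT", lambda symbol, options: next(scripted)))
```

Because the callback can only select among productions that the grammar allows, every plan produced this way is valid by construction; the chooser influences which valid plan is produced, not whether it is valid.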