Recent large language models (LLMs) are promising for making decisions in grounded environments. However, LLMs frequently fail in complex decision-making tasks due to the misalignment between the pre-trained knowledge in LLMs and the actual rules in the environment. Existing methods require either costly gradient computation or lengthy in-context demonstrations. In this paper, we propose AutoPlan, an approach to guide LLM-based agents to accomplish interactive decision-making tasks. AutoPlan augments the LLM prompt with a task-solving plan and optimizes it through iterative experience collection and reflection. Our experiments show that AutoPlan, though using no in-context demonstrations, achieves success rates on par with the baselines using human-written demonstrations on ALFWorld and even outperforms them by 8% on HotpotQA. The code is available at https://github.com/owaski/AutoPlan.
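The plan-optimization loop described above (collect experience with the current plan, reflect on failures, revise the plan) can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: the function `optimize_plan`, its parameters, the reflection prompt wording, and the toy stand-ins `toy_llm` and `toy_episode` are all hypothetical names introduced here for illustration. The actual method is in the linked repository.

```python
from typing import Callable, List, Tuple


def optimize_plan(
    llm: Callable[[str], str],
    run_episode: Callable[[str, str], Tuple[bool, str]],
    tasks: List[str],
    initial_plan: str,
    iterations: int = 3,
) -> str:
    """Iteratively revise a text plan from failed trajectories (hypothetical sketch)."""
    plan = initial_plan
    for _ in range(iterations):
        # Experience collection: attempt every task with the current plan.
        results = [run_episode(task, plan) for task in tasks]
        failed = [trajectory for success, trajectory in results if not success]
        if not failed:
            break  # every task succeeded, so there is nothing to reflect on
        # Reflection: ask the LLM to rewrite the plan given the failures.
        reflection_prompt = (
            "Current plan:\n" + plan
            + "\n\nFailed trajectories:\n" + "\n".join(failed)
            + "\n\nRewrite the plan so that these failures are avoided."
        )
        plan = llm(reflection_prompt)
    return plan


# Toy stand-ins (assumptions) so the sketch runs without a real LLM or environment.
def toy_episode(task: str, plan: str) -> Tuple[bool, str]:
    # The toy environment's hidden rule: the object is inside a closed drawer.
    success = "open the drawer" in plan
    return success, task + ": " + ("done" if success else "object not found")


def toy_llm(prompt: str) -> str:
    # A real LLM would read the prompt; this stub always returns a fixed revision.
    return "Go to the desk, open the drawer, then take the object."


final_plan = optimize_plan(toy_llm, toy_episode, ["find the key"], "Go to the desk.")
print(final_plan)
```

In this sketch the plan is plain text carried in the prompt, so no gradient computation or in-context demonstrations are involved. Only the plan string changes between iterations.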