Recent endeavors toward directly employing large language models (LLMs) as agent models for interactive planning tasks have shown commendable results. Despite these achievements, such agents still struggle with blind trial-and-error in global planning and with generating hallucinatory actions in local planning, owing to their poor understanding of the ``real'' physical world. Inspired by the human mental world knowledge model, which provides global prior knowledge before a task and maintains local dynamic knowledge during it, in this paper we introduce a parametric World Knowledge Model (WKM) to facilitate agent planning. Concretely, we steer the agent model to self-synthesize knowledge from both expert and sampled trajectories. We then develop a WKM that provides prior task knowledge to guide global planning and dynamic state knowledge to assist local planning. Experimental results on three complex real-world simulated datasets with three state-of-the-art open-source LLMs, Mistral-7B, Gemma-7B, and Llama-3-8B, demonstrate that our method achieves superior performance compared to various strong baselines. Furthermore, our analyses illustrate that the WKM effectively alleviates blind trial-and-error and hallucinatory actions, providing strong support for the agent's understanding of the world. Other interesting findings include: 1) our instance-level task knowledge generalizes better to unseen tasks, 2) a weak WKM can guide strong agent-model planning, and 3) unified WKM training has promising potential for further development. The code is available at https://github.com/zjunlp/WKM.