Recent efforts to use large language models (LLMs) directly as agent models for interactive planning tasks have shown commendable results. Despite these achievements, such agents still suffer from blind trial-and-error in global planning and hallucinatory actions in local planning, owing to their poor understanding of the "real" physical world. Imitating the human mental world knowledge model, which provides global prior knowledge before a task and maintains local dynamic knowledge during it, we introduce a parametric World Knowledge Model (WKM) to facilitate agent planning. Concretely, we steer the agent model to self-synthesize knowledge from both expert and sampled trajectories. We then train the WKM to provide prior task knowledge that guides global planning and dynamic state knowledge that assists local planning. Experimental results on three complex real-world simulated datasets with three state-of-the-art open-source LLMs (Mistral-7B, Gemma-7B, and Llama-3-8B) demonstrate that our method achieves superior performance over various strong baselines. Our analysis further illustrates that the WKM effectively alleviates blind trial-and-error and hallucinatory actions, providing strong support for the agent's understanding of the world. Other interesting findings include: 1) our instance-level task knowledge generalizes better to unseen tasks, 2) a weak WKM can guide strong agent-model planning, and 3) unified WKM training has promising potential for further development. Code will be available at https://github.com/zjunlp/WKM.
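The division of labor described above (a global prior consulted once before the task, plus local state knowledge refreshed at every step) can be sketched as a toy planning loop. All function names and logic below are illustrative assumptions for exposition, not the paper's actual implementation; in practice the WKM and the agent would both be LLMs.

```python
# Hypothetical sketch: a World Knowledge Model (WKM) augmenting an agent's
# planning loop. Names and logic are assumptions, not the paper's method.

def task_knowledge(task: str) -> str:
    """Global prior: the WKM summarizes task-level knowledge before planning."""
    return f"prior({task})"

def state_knowledge(history: list) -> str:
    """Local dynamic knowledge: the WKM summarizes the state mid-task."""
    return f"state after {len(history)} steps"

def agent_step(task: str, prior: str, state: str) -> str:
    """Stand-in for the agent LLM choosing its next action, conditioned on
    both kinds of knowledge supplied by the WKM."""
    return f"act on '{task}' given [{prior}] and [{state}]"

def plan(task: str, max_steps: int = 3) -> list:
    prior = task_knowledge(task)          # consulted once, before the task
    history = []
    for _ in range(max_steps):
        state = state_knowledge(history)  # refreshed at every step
        history.append(agent_step(task, prior, state))
    return history

trajectory = plan("put a clean mug on the desk")
```

The key design point the sketch mirrors is that task knowledge is static across the episode while state knowledge is recomputed per step, which is how the WKM can both constrain global exploration and ground each local action.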