Although large language models (LLMs) have achieved significant success in various tasks, they often struggle with hallucination, especially in scenarios requiring deep and responsible reasoning. These issues can be partially addressed by introducing external knowledge graphs (KGs) into LLM reasoning. In this paper, we propose a new LLM-KG integration paradigm, ``$\hbox{LLM}\otimes\hbox{KG}$'', which treats the LLM as an agent that interactively explores related entities and relations on KGs and reasons over the retrieved knowledge. We implement this paradigm in a new approach called Think-on-Graph (ToG), in which the LLM agent iteratively executes beam search on the KG, discovers the most promising reasoning paths, and returns the most likely reasoning results. Through a series of experiments, we examine and illustrate the following advantages of ToG: 1) compared with LLMs, ToG has better deep reasoning power; 2) ToG offers knowledge traceability and knowledge correctability by leveraging LLM reasoning and expert feedback; 3) ToG provides a flexible plug-and-play framework for different LLMs, KGs, and prompting strategies without any additional training cost; 4) ToG with small LLMs can exceed the performance of large LLMs such as GPT-4 in certain scenarios, which reduces the cost of LLM deployment and application. As a training-free method with lower computational cost and better generality, ToG achieves overall state-of-the-art (SOTA) results on 6 out of 9 datasets, where most previous SOTA methods rely on additional training.
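To make the iterative beam search over a KG concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a toy in-memory graph and a hypothetical `score_path` callable that stands in for the LLM agent's judgment of how promising a path is; the names `beam_search_paths`, `keyword_scorer`, and `toy_kg`, and the parameters `beam_width` and `max_depth`, are illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Toy knowledge graph: entity -> list of (relation, neighbor entity).
KG = Dict[str, List[Tuple[str, str]]]
# A path alternates entity, relation, entity, ... and starts at a topic entity.
Path = Tuple[str, ...]


def beam_search_paths(
    kg: KG,
    start_entities: List[str],
    score_path: Callable[[Path], float],
    beam_width: int = 2,
    max_depth: int = 2,
) -> List[Path]:
    """Keep the `beam_width` highest-scoring paths at each depth.

    `score_path` is a hypothetical stand-in for the LLM agent: in an
    LLM-driven system, a prompt would rate each candidate path instead.
    """
    beam: List[Path] = [(entity,) for entity in start_entities]
    for _ in range(max_depth):
        candidates: List[Path] = []
        for path in beam:
            tail = path[-1]
            # Expand the path by every outgoing edge of its last entity.
            for relation, neighbor in kg.get(tail, []):
                candidates.append(path + (relation, neighbor))
        if not candidates:
            break  # nothing left to explore
        # Prune: sort by score (highest first) and keep the top paths.
        candidates.sort(key=score_path, reverse=True)
        beam = candidates[:beam_width]
    return beam


# Illustrative data and scorer (assumptions, not taken from the paper).
toy_kg: KG = {
    "Canberra": [("capital_of", "Australia"), ("located_in", "ACT")],
    "Australia": [
        ("head_of_government", "Prime Minister"),
        ("continent", "Oceania"),
    ],
    "ACT": [("type", "Territory")],
}


def keyword_scorer(path: Path) -> float:
    # Counts how many preferred relations appear on the path.
    preferred = {"capital_of", "head_of_government"}
    return sum(1.0 for item in path if item in preferred)


top_paths = beam_search_paths(
    toy_kg, ["Canberra"], keyword_scorer, beam_width=2, max_depth=2
)
for path in top_paths:
    print(" -> ".join(path))
```

Because every retained path is an explicit chain of entities and relations, each answer can be traced back to the edges that support it, which is the kind of traceability the abstract describes. Replacing `keyword_scorer` with a call to a language model is where the plug-and-play aspect would enter in such a sketch.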