Although large language models (LLMs) have achieved significant success in various tasks, they often struggle with hallucination, especially in scenarios requiring deep and responsible reasoning. These issues can be partially addressed by introducing external knowledge graphs (KGs) into LLM reasoning. In this paper, we propose a new LLM-KG integration paradigm, ``$\hbox{LLM}\otimes\hbox{KG}$'', which treats the LLM as an agent that interactively explores related entities and relations on KGs and reasons over the retrieved knowledge. We implement this paradigm in a new approach called Think-on-Graph (ToG), in which the LLM agent iteratively executes beam search on the KG, discovers the most promising reasoning paths, and returns the most likely reasoning results. Our experiments examine and illustrate the following advantages of ToG: 1) compared with LLMs alone, ToG has better deep reasoning power; 2) ToG offers knowledge traceability and knowledge correctability by leveraging LLM reasoning and expert feedback; 3) ToG provides a flexible plug-and-play framework for different LLMs, KGs, and prompting strategies without any additional training cost; 4) in certain scenarios, ToG with small LLMs can outperform large LLMs such as GPT-4, which reduces the cost of LLM deployment and application. As a training-free method with lower computational cost and better generality, ToG achieves overall state-of-the-art (SOTA) results on 6 of 9 datasets, where most previous SOTA methods rely on additional training.
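The beam search over a KG described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph `TOY_KG`, the function names `beam_search` and `keyword_score`, and the `width` and `depth` parameters are all hypothetical, and a simple keyword-overlap scorer stands in for the LLM that would rank candidate paths in ToG. The sketch also omits the step in which the LLM judges whether the retrieved paths suffice to answer the question.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]   # (head entity, relation, tail entity)
Path = Tuple[Triple, ...]       # a reasoning path is a chain of triples

# Hypothetical toy KG: head entity -> list of (relation, tail entity).
TOY_KG: Dict[str, List[Tuple[str, str]]] = {
    "Canberra": [
        ("capital_of", "Australia"),
        ("located_in", "Australian Capital Territory"),
    ],
    "Australia": [
        ("continent", "Oceania"),
        ("currency", "Australian dollar"),
    ],
    "Australian Capital Territory": [("part_of", "Australia")],
}


def keyword_score(question: str, path: Path) -> float:
    """Stand-in for the LLM scorer (an assumption of this sketch):
    count relation tokens along the path that also occur in the question."""
    words = set(question.lower().replace("?", "").split())
    return float(
        sum(tok in words for _, rel, _ in path for tok in rel.split("_"))
    )


def beam_search(
    kg: Dict[str, List[Tuple[str, str]]],
    question: str,
    topic_entity: str,
    score: Callable[[str, Path], float],
    width: int = 2,
    depth: int = 2,
) -> List[Path]:
    """Keep the `width` best-scoring paths at each of up to `depth` hops."""
    beams: List[Tuple[Path, str]] = [((), topic_entity)]
    for _ in range(depth):
        # Expand every surviving path by one outgoing edge.
        candidates: List[Tuple[Path, str]] = []
        for path, entity in beams:
            for relation, tail in kg.get(entity, []):
                candidates.append((path + ((entity, relation, tail),), tail))
        if not candidates:
            break  # no path can be extended further
        # Prune: rank all candidates and keep only the top `width`.
        candidates.sort(key=lambda c: score(question, c[0]), reverse=True)
        beams = candidates[:width]
    return [path for path, _ in beams]
```

Calling `beam_search(TOY_KG, "On which continent is the country whose capital is Canberra?", "Canberra", keyword_score)` returns the retained paths in descending score order. Because each path is an explicit chain of triples, it can be inspected or corrected, which is the kind of traceability the abstract refers to.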