Unifying Large Language Models and Knowledge Graphs: A Roadmap

Large language models (LLMs), such as ChatGPT and GPT-4, are making new waves in natural language processing and artificial intelligence because of their emergent abilities and generalizability. However, LLMs are black-box models that often fall short in capturing and accessing factual knowledge. In contrast, knowledge graphs (KGs), for example Wikipedia and Huapu, are structured knowledge models that explicitly store rich factual knowledge. KGs can enhance LLMs by providing external knowledge for inference and interpretability. Meanwhile, KGs are difficult to construct and evolve by nature, which makes it hard for existing KG methods to generate new facts and represent unseen knowledge. LLMs and KGs are therefore complementary, and unifying them makes it possible to leverage the advantages of both at once. In this article, we present a forward-looking roadmap for the unification of LLMs and KGs. Our roadmap consists of three general frameworks: 1) KG-enhanced LLMs, which incorporate KGs during the pre-training and inference phases of LLMs, or use KGs to better understand the knowledge learned by LLMs; 2) LLM-augmented KGs, which leverage LLMs for KG tasks such as embedding, completion, construction, graph-to-text generation, and question answering; and 3) Synergized LLMs + KGs, in which LLMs and KGs play equal roles and work in a mutually beneficial way, enhancing both for bidirectional reasoning driven by data and knowledge. We review and summarize existing efforts within these three frameworks and pinpoint future research directions for each.
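
To make the first framework more concrete, the following is a minimal sketch of KG-enhanced inference, in which facts retrieved from a KG are placed in the prompt before it is sent to an LLM. It is an illustration under stated assumptions, not a method taken from the article: the toy triples, the substring-based entity matching, and the names `TOY_KG`, `retrieve_triples`, and `build_prompt` are all hypothetical, and no actual LLM is called.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object). This tiny KG is a hypothetical
# stand-in for a real knowledge graph.
Triple = Tuple[str, str, str]

TOY_KG: List[Triple] = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "field", "physics"),
    ("Warsaw", "capital_of", "Poland"),
]


def retrieve_triples(question: str, kg: List[Triple]) -> List[Triple]:
    """Return triples whose subject or object is mentioned in the question.

    Assumption: plain case-insensitive substring matching stands in for
    real entity linking.
    """
    q = question.lower()
    return [t for t in kg if t[0].lower() in q or t[2].lower() in q]


def build_prompt(question: str, kg: List[Triple]) -> str:
    """Prepend the retrieved facts to the question as plain-text context."""
    facts = retrieve_triples(question, kg)
    fact_lines = "\n".join(
        f"- {s} {p.replace('_', ' ')} {o}" for s, p, o in facts
    )
    return f"Known facts:\n{fact_lines}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    # The resulting string would be passed to an LLM of the reader's choice.
    print(build_prompt("Where was Marie Curie born?", TOY_KG))
```

In this sketch the KG acts only as an external source of facts at inference time; incorporating KGs during pre-training, the other option the first framework names, would require a different setup.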
