In this paper, we investigate retrieval-augmented generation (RAG) based on Knowledge Graphs (KGs) to improve the accuracy and reliability of Large Language Models (LLMs). Recent approaches suffer from insufficient and repetitive knowledge retrieval, tedious and time-consuming query parsing, and monotonous knowledge utilization. To address these issues, we develop the Hypothesis Knowledge Graph Enhanced (HyKGE) framework, which leverages the reasoning capacity of LLMs to compensate for the incompleteness of user queries, optimizes the interaction process with LLMs, and provides diverse retrieved knowledge. Specifically, HyKGE uses the zero-shot capability and rich knowledge of LLMs to generate Hypothesis Outputs (HOs) that extend the feasible exploration directions in the KGs, and uses a carefully curated prompt to enhance the density and efficiency of LLMs' responses. Furthermore, we introduce the HO Fragment Granularity-aware Rerank Module to filter out noise while balancing diversity and relevance in the retrieved knowledge. Experiments on two Chinese medical multiple-choice question datasets and one Chinese open-domain medical Q&A dataset with two LLM turbos demonstrate the superiority of HyKGE in terms of accuracy and explainability.
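The pipeline outlined in the abstract (generate a hypothesis output, use it together with the query to find anchor entities in the KG, retrieve candidate knowledge, then rerank against fragments of the query and hypothesis) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the toy `KG`, the stubbed `hypothesis_output` (a real system would call an LLM), the substring-based entity matching, and the Jaccard word overlap used as a stand-in for the reranker. All function names are hypothetical.

```python
import re

# Toy knowledge graph as (head, relation, tail) triples; contents are illustrative only.
KG = [
    ("influenza", "has_symptom", "fever"),
    ("influenza", "treated_by", "oseltamivir"),
    ("fever", "relieved_by", "acetaminophen"),
    ("migraine", "has_symptom", "headache"),
]

def hypothesis_output(query):
    # Stand-in for a zero-shot LLM call; a real system would prompt a model here.
    return "The patient may have influenza. Oseltamivir is a common treatment."

def find_entities(text):
    # Naive entity linking (assumption): KG entity names that occur in the text.
    lowered = text.lower()
    names = {h for h, _, _ in KG} | {t for _, _, t in KG}
    return {name for name in names if name in lowered}

def retrieve(anchors):
    # Keep every triple that touches at least one anchor entity.
    return [t for t in KG if t[0] in anchors or t[2] in anchors]

def fragments(text):
    # Split text into sentence-level fragments.
    return [f.strip() for f in re.split(r"[.?!]", text) if f.strip()]

def tokens(text):
    # Lowercased alphabetic words as a set.
    return set(re.findall(r"[a-z]+", text.lower()))

def rerank(triples, frags, top_k):
    # Score each triple by its best Jaccard word overlap with any fragment
    # (a simple stand-in for a learned reranker), then keep the top_k.
    def score(triple):
        words = tokens(" ".join(triple).replace("_", " "))
        return max(
            (len(words & tokens(f)) / len(words | tokens(f)) for f in frags),
            default=0.0,
        )
    return sorted(triples, key=score, reverse=True)[:top_k]

def hykge_sketch(query, top_k=2):
    ho = hypothesis_output(query)
    # Anchors come from both the query and the hypothesis output.
    anchors = find_entities(query) | find_entities(ho)
    candidates = retrieve(anchors)
    return rerank(candidates, fragments(query) + fragments(ho), top_k)

if __name__ == "__main__":
    for triple in hykge_sketch("I have a fever and feel weak. What should I take?"):
        print(triple)
```

In this toy run the query alone mentions only `fever`; the hypothesis output contributes `influenza` and `oseltamivir` as additional anchors, which is the kind of query completion the abstract attributes to HOs. Scoring against individual fragments rather than the full concatenated text is meant to mirror the fragment-level granularity of the rerank step, under the assumptions stated above.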