Large language models (LLMs) with retrieval-augmented generation (RAG) face a key challenge in complex retrieval tasks such as multi-hop question answering, which require the model to navigate across multiple documents and generate comprehensive responses from fragmented information. To tackle this challenge, we introduce KG-Retriever, a novel knowledge graph-based RAG framework with a hierarchical knowledge retriever. The retrieval index in KG-Retriever is built on a hierarchical index graph that consists of a knowledge graph layer and a collaborative document layer. The associative nature of graph structures is used to strengthen intra-document and inter-document connectivity, which alleviates the information fragmentation problem and improves the efficiency of cross-document retrieval for LLMs. By combining coarse-grained collaborative information from neighboring documents with concise information from the knowledge graph, KG-Retriever achieves marked improvements on five public QA datasets, demonstrating the effectiveness and efficiency of the proposed RAG framework.
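The two-layer index described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how either layer is built or queried, so everything below is an assumption made for illustration: the class name `HierarchicalIndex`, its methods, the rule that links two documents when they share an entity, and the hop-based expansion are all hypothetical and are not the KG-Retriever implementation.

```python
from collections import defaultdict
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (head entity, relation, tail entity)


@dataclass
class HierarchicalIndex:
    """Toy two-layer index; hypothetical, not the paper's implementation."""

    # Knowledge graph layer: triples extracted from each document.
    doc_triples: dict[str, list[Triple]] = field(default_factory=dict)
    # Collaborative document layer: undirected document-to-document edges.
    doc_neighbors: dict[str, set[str]] = field(
        default_factory=lambda: defaultdict(set)
    )

    def add_document(self, doc_id: str, triples: list[Triple]) -> None:
        self.doc_triples[doc_id] = list(triples)

    def entities(self, doc_id: str) -> set[str]:
        # Every head and tail entity mentioned in the document's triples.
        return {e for h, _, t in self.doc_triples[doc_id] for e in (h, t)}

    def link_documents(self, min_shared: int = 1) -> None:
        # Assumption: two documents are neighbors if they share entities.
        ids = sorted(self.doc_triples)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                if len(self.entities(a) & self.entities(b)) >= min_shared:
                    self.doc_neighbors[a].add(b)
                    self.doc_neighbors[b].add(a)

    def retrieve(self, query_entities, hops: int = 1) -> dict[str, list[Triple]]:
        # Seed with documents that mention a query entity, then expand
        # along the document layer for the requested number of hops.
        query = set(query_entities)
        seeds = {d for d in self.doc_triples if self.entities(d) & query}
        frontier, selected = set(seeds), set(seeds)
        for _ in range(hops):
            frontier = {
                n for d in frontier for n in self.doc_neighbors.get(d, ())
            } - selected
            selected |= frontier
        # Return the concise triples of every selected document.
        return {d: self.doc_triples[d] for d in sorted(selected)}


index = HierarchicalIndex()
index.add_document("doc_a", [("Marie Curie", "born_in", "Warsaw")])
index.add_document("doc_b", [("Warsaw", "capital_of", "Poland")])
index.add_document("doc_c", [("Paris", "capital_of", "France")])
index.link_documents()

result = index.retrieve(["Marie Curie"], hops=1)
for doc_id, triples in result.items():
    print(doc_id, triples)
```

In this toy example a two-hop question such as "In which country was Marie Curie born?" needs facts from both `doc_a` and `doc_b`. The query entity matches only `doc_a`, and the document-layer edge through the shared entity `Warsaw` brings in `doc_b`, while the unrelated `doc_c` is left out.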