Retrieval-Augmented Generation (RAG) has emerged as a dominant paradigm for mitigating hallucinations in Large Language Models (LLMs) by incorporating external knowledge. Nevertheless, effectively integrating and interpreting key evidence scattered across noisy documents remains a critical challenge for existing RAG systems. In this paper, we propose GraphAnchor, a novel Graph-Anchored Knowledge Indexing approach that reconceptualizes graph structures from static knowledge representations into active, evolving knowledge indices. GraphAnchor incrementally updates a graph during iterative retrieval to anchor salient entities and relations, yielding a structured index that guides the LLM in evaluating knowledge sufficiency and formulating subsequent subqueries. The final answer is generated by jointly leveraging all retrieved documents and the final evolved graph. Experiments on four multi-hop question answering benchmarks demonstrate the effectiveness of GraphAnchor and show that it modulates the LLM's attention to more effectively associate key information distributed across the retrieved documents. All code and data are available at https://github.com/NEUIR/GraphAnchor.
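As a rough illustration of the loop described above (retrieve, update the graph index, check sufficiency, form the next subquery, then answer from all documents plus the final graph), the following is a minimal sketch and not the authors' implementation; the released code is in the repository linked above. Every name here (`GraphIndex`, `graph_anchored_loop`, the `toy_*` helpers, `max_steps`) is hypothetical, and the toy functions are rule-based stand-ins for what would presumably be a real retriever and LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


@dataclass
class GraphIndex:
    """Hypothetical evolving index: a set of entity-relation triples."""
    triples: Set[Triple] = field(default_factory=set)

    def update(self, new_triples: List[Triple]) -> None:
        self.triples.update(new_triples)


def graph_anchored_loop(
    question: str,
    retrieve: Callable[[str], List[str]],
    extract_triples: Callable[[List[str]], List[Triple]],
    is_sufficient: Callable[[str, GraphIndex], bool],
    next_subquery: Callable[[str, GraphIndex], str],
    generate_answer: Callable[[str, List[str], GraphIndex], str],
    max_steps: int = 4,  # assumed iteration budget, not taken from the paper
):
    graph, docs, query = GraphIndex(), [], question
    for _ in range(max_steps):
        new_docs = retrieve(query)
        docs.extend(d for d in new_docs if d not in docs)
        graph.update(extract_triples(new_docs))   # anchor entities and relations
        if is_sufficient(question, graph):        # knowledge sufficiency check
            break
        query = next_subquery(question, graph)    # graph guides the next subquery
    # The answer uses all retrieved documents and the final graph.
    return generate_answer(question, docs, graph), graph, docs


# Toy stand-ins (assumptions for illustration only).
CORPUS: Dict[str, List[Triple]] = {
    "Film X was directed by Alice.": [("Film X", "directed_by", "Alice")],
    "Alice was born in Paris.": [("Alice", "born_in", "Paris")],
}


def toy_retrieve(query: str) -> List[str]:
    # Return documents whose subject entity is mentioned in the query.
    return [doc for doc, ts in CORPUS.items() if ts[0][0] in query]


def toy_extract(new_docs: List[str]) -> List[Triple]:
    return [t for doc in new_docs for t in CORPUS[doc]]


def toy_sufficient(question: str, graph: GraphIndex) -> bool:
    return any(rel == "born_in" for _, rel, _ in graph.triples)


def toy_subquery(question: str, graph: GraphIndex) -> str:
    # Ask about an entity that is mentioned as a tail but not yet expanded.
    heads = {h for h, _, _ in graph.triples}
    tails = {t for _, _, t in graph.triples}
    frontier = sorted(tails - heads)
    return f"Where was {frontier[0]} born?" if frontier else question


def toy_answer(question: str, docs: List[str], graph: GraphIndex) -> str:
    return next(t for _, rel, t in sorted(graph.triples) if rel == "born_in")


answer, graph, docs = graph_anchored_loop(
    "Where was the director of Film X born?",
    toy_retrieve, toy_extract, toy_sufficient, toy_subquery, toy_answer,
)
print(answer)
```

In this toy run the first retrieval only surfaces the document about Film X; the graph then exposes "Alice" as an unexpanded entity, which drives the second subquery. Passing the retriever and LLM steps in as callables is a design choice made here to keep the control flow visible, not something the abstract specifies.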