Graph-based Retrieval-Augmented Generation (GraphRAG) extends traditional RAG by using knowledge graphs (KGs) to give large language models (LLMs) a structured, semantically coherent context, yielding more grounded answers. However, the reasoning process of GraphRAG remains a black box, limiting our ability to understand how specific pieces of structured knowledge influence the final output. Existing explainability (XAI) methods for RAG systems, designed for text-based retrieval, are limited in their ability to interpret an LLM response through the relational structures among knowledge components, creating a critical gap in transparency and trustworthiness. To address this, we introduce XGRAG, a novel framework that generates causally grounded explanations for GraphRAG systems by employing graph-based perturbation strategies to quantify the contribution of individual graph components to the model's answer. We conduct extensive experiments comparing XGRAG against RAG-Ex, an XAI baseline for standard RAG, and evaluate its robustness across various question types, narrative structures, and LLMs. Our results demonstrate a 14.81% improvement in explanation quality over the baseline RAG-Ex across NarrativeQA, FairyTaleQA, and TriviaQA, evaluated by an F1-score that measures the alignment between generated explanations and original answers. Furthermore, XGRAG explanations exhibit a strong correlation with graph centrality measures, validating the framework's ability to capture graph structure. XGRAG provides a scalable and generalizable approach towards trustworthy AI through transparent, graph-based explanations that enhance the interpretability of RAG systems.
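As a rough illustration of the perturbation idea summarized in the abstract, the sketch below removes one knowledge-graph triple at a time, regenerates the answer, and scores each triple by how far the answer drifts from the original, measured with a token-level F1. This is not the XGRAG implementation: `token_f1`, `edge_importance`, and `toy_answer_fn` are hypothetical names chosen for this example, leave-one-out removal is only one possible perturbation strategy, and the toy answer function is a stand-in for a real GraphRAG call to an LLM.

```python
from collections import Counter
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail); an assumed graph encoding


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between two whitespace-tokenized, lowercased strings."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a perfect match; one empty counts as no match.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def edge_importance(
    question: str,
    triples: List[Triple],
    answer_fn: Callable[[str, List[Triple]], str],
) -> List[Tuple[Triple, float]]:
    """Leave-one-out perturbation: score each triple by the answer change it causes."""
    baseline = answer_fn(question, triples)
    scores = []
    for i, triple in enumerate(triples):
        perturbed = triples[:i] + triples[i + 1:]  # graph with one triple removed
        new_answer = answer_fn(question, perturbed)
        # Importance is high when the perturbed answer diverges from the baseline.
        scores.append((triple, 1.0 - token_f1(new_answer, baseline)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


def toy_answer_fn(question: str, triples: List[Triple]) -> str:
    """Hypothetical stand-in for an LLM: answers from the first 'lives_in' triple."""
    for _head, relation, tail in triples:
        if relation == "lives_in":
            return tail
    return "unknown"


if __name__ == "__main__":
    graph = [("Alice", "lives_in", "Paris"), ("Alice", "likes", "tea")]
    for triple, score in edge_importance("Where does Alice live?", graph, toy_answer_fn):
        print(triple, score)
```

In this toy setting the `lives_in` triple receives the highest score because removing it changes the answer, while the `likes` triple receives zero. With a real LLM the answer function is stochastic and costly, so a practical variant would likely need repeated sampling or a semantic similarity measure in place of exact token overlap.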