Retrieval-Augmented Generation (RAG) systems combine dense retrievers with language models to ground LLM outputs in retrieved documents. However, the opacity of how these components interact creates challenges for deployment in high-stakes domains. We present RAG-E, an end-to-end explainability framework that quantifies retriever-generator alignment through mathematically grounded attribution methods. Our approach adapts Integrated Gradients for retriever analysis, introduces PMCSHAP, a Monte Carlo-stabilized Shapley value approximation, for generator attribution, and defines the Weighted Attribution-Relevance Gap (WARG) metric to measure how well a generator's document usage aligns with the retriever's ranking. Empirical analysis on TREC CAsT and FoodSafeSum reveals critical misalignments: for 47.4% to 66.7% of queries, generators ignore the retriever's top-ranked documents, while for 48.1% to 65.9% they rely on documents the retriever ranked as less relevant. These failure modes demonstrate that RAG output quality depends not solely on individual component performance but on their interplay, which RAG-E makes auditable.