Retrieval-Augmented Generation (RAG) aims to mitigate hallucinations in large language models (LLMs) by grounding responses in retrieved documents. Yet RAG-based LLMs still hallucinate even when provided with correct and sufficient context. A growing line of work suggests that this stems from an imbalance between how models use external context and how they use their internal knowledge, and several approaches have attempted to quantify these two signals for hallucination detection. However, existing methods require extensive hyperparameter tuning, which limits their generalizability. We propose LUMINA, a novel framework that detects hallucinations in RAG systems through context-knowledge signals: external context utilization is quantified via distributional distance, while internal knowledge utilization is measured by tracking how predicted tokens evolve across transformer layers. We further introduce a framework for statistically validating these measurements. Experiments on common RAG hallucination benchmarks and four open-source LLMs show that LUMINA achieves consistently high AUROC and AUPRC scores, outperforming prior utilization-based methods by up to +13% AUROC on HalluRAG. Moreover, LUMINA remains robust under relaxed assumptions about retrieval quality and model matching, offering both effectiveness and practicality. LUMINA is available at https://github.com/deeplearning-wisc/LUMINA
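To make the two signals concrete, the following is a minimal sketch and not LUMINA's actual implementation. The abstract specifies only "distributional distance" and "tracking how predicted tokens evolve across transformer layers", so every concrete choice here is an assumption made for illustration: Jensen-Shannon divergence stands in for the distance, a logit-lens-style top-1 agreement rate stands in for the layer-wise tracking, and the function names `context_utilization` and `knowledge_utilization` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


def context_utilization(logits_with_ctx, logits_without_ctx):
    """ASSUMED proxy for external context utilization.

    Compares the next-token distribution produced with the retrieved
    documents in the prompt against the one produced without them.
    A larger distance suggests the context changed the prediction more.
    """
    return js_divergence(softmax(logits_with_ctx), softmax(logits_without_ctx))


def knowledge_utilization(layer_logits):
    """ASSUMED proxy for internal knowledge utilization.

    `layer_logits` holds one vocabulary-sized logit vector per layer
    (e.g. obtained by projecting hidden states through the output head).
    Returns the fraction of intermediate layers whose top-1 token already
    matches the final layer's top-1 token.
    """
    if len(layer_logits) < 2:
        raise ValueError("need at least one intermediate layer and a final layer")

    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)

    final_token = argmax(layer_logits[-1])
    intermediate = layer_logits[:-1]
    matches = sum(1 for logits in intermediate if argmax(logits) == final_token)
    return matches / len(intermediate)


# Toy usage with a 3-token vocabulary and 4 layers (made-up numbers).
ctx_score = context_utilization([2.0, 0.5, 0.1], [0.1, 0.5, 2.0])
know_score = knowledge_utilization(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 2.0, 0.0], [0.0, 3.0, 0.0]]
)
print(ctx_score, know_score)
```

In a real pipeline the logits would come from an LLM forward pass, once with and once without the retrieved context, and the two scores would be computed per generated token and then aggregated over the response. How the scores are aggregated and combined into a detection score is left open here, since the abstract does not specify it.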