Stable-RAG：缓解检索增强生成中因检索排列顺序引发的幻觉问题 (Stable-RAG: Mitigating Retrieval-Permutation-Induced Hallucinations in Retrieval-Augmented Generation) - 专知论文

会员服务 ·

0

排列 · 敏感性 · 鲁棒 · 检索增强 · 一致 ·

Stable-RAG: Mitigating Retrieval-Permutation-Induced Hallucinations in Retrieval-Augmented Generation

翻译：Stable-RAG：缓解检索增强生成中因检索排列顺序引发的幻觉问题

Qianchi Zhang,Hainan Zhang,Liang Pang,Hongwei Zheng,Zhiming Zheng

from arxiv, 19 pages, 13figures, 8 tables, under review

Retrieval-Augmented Generation (RAG) has become a key paradigm for reducing factual hallucinations in large language models (LLMs), yet little is known about how the order of retrieved documents affects model behavior. We empirically show that under Top-5 retrieval with the gold document included, LLM answers vary substantially across permutations of the retrieved set, even when the gold document is fixed in the first position. This reveals a previously underexplored sensitivity to retrieval permutations. Although robust RAG methods primarily focus on enhancing LLM robustness to low-quality retrieval and mitigating positional bias to distribute attention fairly over long contexts, neither approach directly addresses permutation sensitivity. In this paper, we propose Stable-RAG, which exploits permutation sensitivity estimation to mitigate permutation-induced hallucinations. Stable-RAG runs the generator under multiple retrieval orders, clusters hidden states, and decodes from a cluster-center representation that captures the dominant reasoning pattern. It then uses these reasoning results to align hallucinated outputs toward the correct answer, encouraging the model to produce consistent and accurate predictions across document permutations. Experiments on three QA datasets show that Stable-RAG significantly improves answer accuracy, reasoning consistency and robust generalization across datasets, retrievers, and input lengths compared with baselines.

翻译：检索增强生成（RAG）已成为减少大语言模型事实性幻觉的关键范式，然而检索文档的顺序如何影响模型行为尚不明确。我们通过实验证明，在包含黄金文档的Top-5检索条件下，即使黄金文档固定于首位，大语言模型的答案仍会随检索集合的排列顺序发生显著变化。这揭示了一种先前未被充分探索的检索排列敏感性。尽管现有鲁棒性RAG方法主要致力于增强大语言模型对低质量检索的鲁棒性，并通过缓解位置偏差来公平分配长上下文注意力，但两种方法均未直接解决排列敏感性问题。本文提出Stable-RAG方法，利用排列敏感性估计来缓解因排列顺序引发的幻觉。Stable-RAG通过在多种检索顺序下运行生成器，对隐藏状态进行聚类，并从捕获主导推理模式的聚类中心表征进行解码。随后利用这些推理结果将幻觉输出对齐至正确答案，促使模型在不同文档排列下产生一致且准确的预测。在三个问答数据集上的实验表明，相较于基线方法，Stable-RAG在答案准确性、推理一致性及跨数据集、检索器和输入长度的鲁棒泛化能力方面均有显著提升。

0

相关内容

【AAAI2026】TruthfulRAG：基于知识图谱解决检索增强生成中的事实层冲突

【AAAI2026】TruthfulRAG：基于知识图谱解决检索增强生成中的事实层冲突

专知会员服务

21+阅读 · 2025年11月15日

《缓解大语言模型（LLMs）幻觉：面向应用的检索增强生成（RAG）、推理与智能体系统综述》

《缓解大语言模型（LLMs）幻觉：面向应用的检索增强生成（RAG）、推理与智能体系统综述》

专知会员服务

24+阅读 · 2025年10月29日

检索增强生成（RAG）技术，261页slides

检索增强生成（RAG）技术，261页slides

专知会员服务

42+阅读 · 2025年10月16日

【新书】Essential GraphRAG: 知识图谱增强的RAG

【新书】Essential GraphRAG: 知识图谱增强的RAG

专知会员服务

33+阅读 · 2025年7月17日

【新书】检索增强生成（RAG）入门指南

【新书】检索增强生成（RAG）入门指南

专知会员服务

30+阅读 · 2025年6月25日

检索增强生成(RAG)与推理的协同作用：一项系统综述

检索增强生成(RAG)与推理的协同作用：一项系统综述

专知会员服务

34+阅读 · 2025年4月27日

图增强生成（GraphRAG）

图增强生成（GraphRAG）

专知会员服务

35+阅读 · 2025年1月4日

检索增强生成系统中的可信度：综述

检索增强生成系统中的可信度：综述

专知会员服务

44+阅读 · 2024年9月18日

图怎么用RAG？北大等最新《图检索增强生成(GraphRAG)》综述

图怎么用RAG？北大等最新《图检索增强生成(GraphRAG)》综述

专知会员服务

55+阅读 · 2024年8月22日

RAG+LLM=？同济大学等最新《大型语言模型的检索增强生成》综述

RAG+LLM=？同济大学等最新《大型语言模型的检索增强生成》综述

专知会员服务

111+阅读 · 2023年12月19日

滑铁卢大学2020新书《预训练Transformer模型文本排序》，155页pdf

滑铁卢大学2020新书《预训练Transformer模型文本排序》，155页pdf

专知

10+阅读 · 2020年10月19日

高效的文本生成方法 — LaserTagger 现已开源

高效的文本生成方法 — LaserTagger 现已开源

TensorFlow

30+阅读 · 2020年2月27日

成熟的目标检测，也该自己学习数据增强策略达到SOTA了

成熟的目标检测，也该自己学习数据增强策略达到SOTA了

机器之心

17+阅读 · 2019年6月28日

最新最权威《深度学习显著目标检测综述》论文代码数据发布，带你全面了解显著目标检测方法

最新最权威《深度学习显著目标检测综述》论文代码数据发布，带你全面了解显著目标检测方法

专知

79+阅读 · 2019年4月24日

百度提出ERNIE，多项中文NLP任务表现出色（已开源）

百度提出ERNIE，多项中文NLP任务表现出色（已开源）

AI100

33+阅读 · 2019年3月16日

博客 | 总结+paper分享|对话系统中的自然语言生成技术（NLG）

博客 | 总结+paper分享|对话系统中的自然语言生成技术（NLG）

AI研习社

16+阅读 · 2018年12月4日

NLG ≠ 机器写作 | 专家专栏

NLG ≠ 机器写作 | 专家专栏

量子位

13+阅读 · 2018年9月10日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

【干货】基于注意力机制的神经匹配模型用于短文本检索

【干货】基于注意力机制的神经匹配模型用于短文本检索

专知

11+阅读 · 2018年1月11日

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

专知

12+阅读 · 2017年12月21日

RACK7选择性结合活跃增强子的分子机制及生物学意义的研究

国家自然科学基金

0+阅读 · 2016年12月31日

针对大规模环境下复杂任务的策略搜索强化学习方法研究

国家自然科学基金

43+阅读 · 2015年12月31日

组合测试用例优先排序算法及选择策略研究

国家自然科学基金

9+阅读 · 2015年12月31日

TSHR易感位点导致Graves病人群TRAb持续阳性的机制探讨

国家自然科学基金

0+阅读 · 2015年12月31日

基于非独立同分布学习理论的图模型词义消歧及领域适应方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于形态和多词的有限语料蒙汉互译调序优化方法

国家自然科学基金

0+阅读 · 2015年12月31日

强调与对比影响语篇理解的认知过程及其神经机制

国家自然科学基金

4+阅读 · 2015年12月31日

图像复原中非凸稀疏优化问题的快速算法

国家自然科学基金

0+阅读 · 2015年12月31日

随机文法作为通用统计模型的扩展

国家自然科学基金

1+阅读 · 2015年12月31日

复杂生产制造环境下的排序问题研究

国家自然科学基金

0+阅读 · 2014年12月31日

A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces

Arxiv

0+阅读 · 2月3日

Bounding Hallucinations: Information-Theoretic Guarantees for RAG Systems via Merlin-Arthur Protocols

Arxiv

0+阅读 · 1月30日

Predicting Retrieval Utility and Answer Quality in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月20日

RAGExplorer: A Visual Analytics System for the Comparative Diagnosis of RAG Systems

Arxiv

0+阅读 · 1月19日

SD-RAG: A Prompt-Injection-Resilient Framework for Selective Disclosure in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月16日

PruneRAG: Confidence-Guided Query Decomposition Trees for Efficient Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月16日

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月15日

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月10日

Stable-RAG: Mitigating Retrieval-Permutation-Induced Hallucinations in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月9日

Stable-RAG: Mitigating Retrieval-Permutation-Induced Hallucinations in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月7日

VIP会员

文章信息

相关主题

相关VIP内容

【AAAI2026】TruthfulRAG：基于知识图谱解决检索增强生成中的事实层冲突

【AAAI2026】TruthfulRAG：基于知识图谱解决检索增强生成中的事实层冲突

专知会员服务

21+阅读 · 2025年11月15日

《缓解大语言模型（LLMs）幻觉：面向应用的检索增强生成（RAG）、推理与智能体系统综述》

《缓解大语言模型（LLMs）幻觉：面向应用的检索增强生成（RAG）、推理与智能体系统综述》

专知会员服务

24+阅读 · 2025年10月29日

检索增强生成（RAG）技术，261页slides

检索增强生成（RAG）技术，261页slides

专知会员服务

42+阅读 · 2025年10月16日

【新书】Essential GraphRAG: 知识图谱增强的RAG

【新书】Essential GraphRAG: 知识图谱增强的RAG

专知会员服务

33+阅读 · 2025年7月17日

【新书】检索增强生成（RAG）入门指南

【新书】检索增强生成（RAG）入门指南

专知会员服务

30+阅读 · 2025年6月25日

检索增强生成(RAG)与推理的协同作用：一项系统综述

检索增强生成(RAG)与推理的协同作用：一项系统综述

专知会员服务

34+阅读 · 2025年4月27日

图增强生成（GraphRAG）

图增强生成（GraphRAG）

专知会员服务

35+阅读 · 2025年1月4日

检索增强生成系统中的可信度：综述

检索增强生成系统中的可信度：综述

专知会员服务

44+阅读 · 2024年9月18日

图怎么用RAG？北大等最新《图检索增强生成(GraphRAG)》综述

图怎么用RAG？北大等最新《图检索增强生成(GraphRAG)》综述

专知会员服务

55+阅读 · 2024年8月22日

RAG+LLM=？同济大学等最新《大型语言模型的检索增强生成》综述

RAG+LLM=？同济大学等最新《大型语言模型的检索增强生成》综述

专知会员服务

111+阅读 · 2023年12月19日

热门VIP内容

开通专知VIP会员享更多权益服务

论学习、公平性与复杂度

《整合杀伤链：一个用于边缘目标验证与战术推理的零样本框架》最新资料

2025中国人工智能学会系列白皮书⸺棋盘上的人工智能|附下载

通用智能体评估的逻辑架构

相关资讯

滑铁卢大学2020新书《预训练Transformer模型文本排序》，155页pdf

滑铁卢大学2020新书《预训练Transformer模型文本排序》，155页pdf

专知

10+阅读 · 2020年10月19日

高效的文本生成方法 — LaserTagger 现已开源

高效的文本生成方法 — LaserTagger 现已开源

TensorFlow

30+阅读 · 2020年2月27日

成熟的目标检测，也该自己学习数据增强策略达到SOTA了

成熟的目标检测，也该自己学习数据增强策略达到SOTA了

机器之心

17+阅读 · 2019年6月28日

最新最权威《深度学习显著目标检测综述》论文代码数据发布，带你全面了解显著目标检测方法

最新最权威《深度学习显著目标检测综述》论文代码数据发布，带你全面了解显著目标检测方法

专知

79+阅读 · 2019年4月24日

百度提出ERNIE，多项中文NLP任务表现出色（已开源）

百度提出ERNIE，多项中文NLP任务表现出色（已开源）

AI100

33+阅读 · 2019年3月16日

博客 | 总结+paper分享|对话系统中的自然语言生成技术（NLG）

博客 | 总结+paper分享|对话系统中的自然语言生成技术（NLG）

AI研习社

16+阅读 · 2018年12月4日

NLG ≠ 机器写作 | 专家专栏

NLG ≠ 机器写作 | 专家专栏

量子位

13+阅读 · 2018年9月10日

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

【论文推荐】最新六篇序列推荐相关论文—卷积序列嵌入学习、用户记忆网络、上下文GRU、迁移学习

专知

10+阅读 · 2018年4月8日

【干货】基于注意力机制的神经匹配模型用于短文本检索

【干货】基于注意力机制的神经匹配模型用于短文本检索

专知

11+阅读 · 2018年1月11日

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

【论文】所见所想所真，对抗学习GAN提升跨模态检索效果！阿里巴巴AI Labs等团队最新工作

专知

12+阅读 · 2017年12月21日

相关论文

A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces

Arxiv

0+阅读 · 2月3日

Bounding Hallucinations: Information-Theoretic Guarantees for RAG Systems via Merlin-Arthur Protocols

Arxiv

0+阅读 · 1月30日

Predicting Retrieval Utility and Answer Quality in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月20日

RAGExplorer: A Visual Analytics System for the Comparative Diagnosis of RAG Systems

Arxiv

0+阅读 · 1月19日

SD-RAG: A Prompt-Injection-Resilient Framework for Selective Disclosure in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月16日

PruneRAG: Confidence-Guided Query Decomposition Trees for Efficient Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月16日

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月15日

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月10日

Stable-RAG: Mitigating Retrieval-Permutation-Induced Hallucinations in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月9日

Stable-RAG: Mitigating Retrieval-Permutation-Induced Hallucinations in Retrieval-Augmented Generation

Arxiv

0+阅读 · 1月7日

相关基金

RACK7选择性结合活跃增强子的分子机制及生物学意义的研究

国家自然科学基金

0+阅读 · 2016年12月31日

针对大规模环境下复杂任务的策略搜索强化学习方法研究

国家自然科学基金

43+阅读 · 2015年12月31日

组合测试用例优先排序算法及选择策略研究

国家自然科学基金

9+阅读 · 2015年12月31日

TSHR易感位点导致Graves病人群TRAb持续阳性的机制探讨

国家自然科学基金

0+阅读 · 2015年12月31日

基于非独立同分布学习理论的图模型词义消歧及领域适应方法研究

国家自然科学基金

1+阅读 · 2015年12月31日

基于形态和多词的有限语料蒙汉互译调序优化方法

国家自然科学基金

0+阅读 · 2015年12月31日

强调与对比影响语篇理解的认知过程及其神经机制

国家自然科学基金

4+阅读 · 2015年12月31日

图像复原中非凸稀疏优化问题的快速算法

国家自然科学基金

0+阅读 · 2015年12月31日

随机文法作为通用统计模型的扩展

国家自然科学基金

1+阅读 · 2015年12月31日

复杂生产制造环境下的排序问题研究

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员