Retrieval-Augmented Generation (RAG) improves the accuracy of Large Language Models (LLMs). However, noise in the retrieved content significantly degrades the quality of LLM generation, which makes denoising mechanisms necessary. Previous work extracts evidence directly, without explicit reasoning, which risks filtering out key clues and generalizes poorly. To address this, we propose EviOmni, which learns to extract rational evidence by reasoning first and then extracting. Specifically, EviOmni integrates evidence reasoning and evidence extraction into one unified trajectory and applies knowledge token masking to avoid information leakage. It is optimized via on-policy reinforcement learning with verifiable rewards on answer, length, and format. Extensive experiments on five benchmark datasets demonstrate the superiority of EviOmni, which provides compact, high-quality evidence, improves the accuracy of downstream tasks, and supports both traditional and agentic RAG systems.
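To make the reward design concrete, the following is a minimal Python sketch of a verifiable reward that combines answer, length, and format terms. It is an illustration under stated assumptions, not the EviOmni implementation: the `<reason>` and `<extract>` tags, the word-count length measure, the exact-match answer check, and the weights are all hypothetical choices that the abstract does not specify. The predicted answer is assumed to come from a downstream generator that is given only the extracted evidence.

```python
import re

# Hypothetical output format: one reasoning span followed by one evidence span.
# The tag names are assumptions made for this sketch.
TRAJECTORY = re.compile(
    r"\s*<reason>(?P<reason>.+?)</reason>\s*"
    r"<extract>(?P<evidence>.+?)</extract>\s*",
    re.DOTALL,
)
TAGS = ("<reason>", "</reason>", "<extract>", "</extract>")


def parse_trajectory(text):
    """Return (reasoning, evidence), or None if the format is violated."""
    match = TRAJECTORY.fullmatch(text)
    # Each tag must appear exactly once, so repeated or nested spans are rejected.
    if match is None or any(text.count(tag) != 1 for tag in TAGS):
        return None
    return match.group("reason").strip(), match.group("evidence").strip()


def length_reward(evidence, passages):
    """Reward compact evidence: 1 minus the evidence-to-passage word ratio."""
    total = len(passages.split())
    if total == 0 or not evidence:
        return 0.0
    return max(0.0, 1.0 - len(evidence.split()) / total)


def answer_reward(predicted, gold):
    """Exact match after lowercasing and whitespace normalization."""
    def normalize(s):
        return " ".join(s.lower().split())
    return 1.0 if normalize(predicted) == normalize(gold) else 0.0


def total_reward(trajectory, passages, predicted, gold, weights=(1.0, 0.5, 0.5)):
    """Weighted sum of answer, length, and format terms (weights are assumed)."""
    parsed = parse_trajectory(trajectory)
    if parsed is None:
        return 0.0  # a malformed trajectory earns no reward in this sketch
    _, evidence = parsed
    w_answer, w_length, w_format = weights
    return (
        w_answer * answer_reward(predicted, gold)
        + w_length * length_reward(evidence, passages)
        + w_format * 1.0  # format term: the trajectory parsed successfully
    )


if __name__ == "__main__":
    passages = (
        "Paris is the capital of France. It hosts the Louvre. "
        "Berlin is the capital of Germany. Rome is in Italy."
    )
    trajectory = (
        "<reason>Passage 1 names the capital.</reason>"
        "<extract>Paris is the capital of France.</extract>"
    )
    print(total_reward(trajectory, passages, predicted="Paris", gold="paris"))
```

Each term can be checked mechanically from the model output and a gold answer, which is what makes the reward verifiable. In this sketch the format check gates the whole reward, so a malformed trajectory cannot score on answer or length alone; other combinations are equally plausible.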