Retrieval-Augmented Generation (RAG) effectively improves the accuracy of Large Language Models (LLMs). However, retrieval noise significantly undermines the quality of LLMs' generation, necessitating the development of denoising mechanisms. Previous works extract evidence directly without deliberate reasoning, which risks filtering out key clues and struggles to generalize. To this end, we propose EviOmni, which learns to extract rational evidence by first reasoning and then extracting. Specifically, EviOmni integrates evidence reasoning and evidence extraction into one unified trajectory, followed by knowledge token masking to avoid information leakage, and is optimized via on-policy reinforcement learning with verifiable rewards covering answer accuracy, length, and format. Extensive experiments on five benchmark datasets demonstrate the superiority of EviOmni: it provides compact, high-quality evidence, improves the accuracy of downstream tasks, and supports both traditional and agentic RAG systems.
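To make the verifiable-reward design concrete, the following is a minimal sketch of how a composite reward over answer accuracy, evidence length, and output format might be computed. All function names, tag conventions, and weights here are illustrative assumptions, not the paper's actual implementation.

```python
def format_ok(output: str) -> bool:
    # Assumption: the policy is expected to wrap extracted evidence in
    # <evidence>...</evidence> tags; any tag scheme would work similarly.
    return "<evidence>" in output and "</evidence>" in output

def composite_reward(pred_answer: str, gold_answer: str,
                     evidence_len: int, max_len: int,
                     output: str) -> float:
    """Combine three verifiable signals into one scalar reward.

    - answer reward: exact match against the gold answer (verifiable)
    - length reward: shorter evidence scores higher, encouraging compactness
    - format reward: output follows the expected evidence-tag format
    The 0.5 / 0.2 weights are purely illustrative.
    """
    r_answer = 1.0 if pred_answer.strip().lower() == gold_answer.strip().lower() else 0.0
    r_length = max(0.0, 1.0 - evidence_len / max_len)
    r_format = 1.0 if format_ok(output) else 0.0
    return r_answer + 0.5 * r_length + 0.2 * r_format
```

In an on-policy RL setup, a scalar like this would be computed per sampled trajectory and fed to the policy-gradient update; because each component is checkable without a learned reward model, the reward remains verifiable.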