Long-Context Question Answering (LCQA), a challenging task, aims to reason over long-context documents to yield accurate answers to questions. Existing long-context Large Language Models (LLMs) for LCQA often struggle with the "lost in the middle" issue. Retrieval-Augmented Generation (RAG) mitigates this issue by providing external factual evidence. However, its chunking strategy disrupts global long-context information, and the substantial noise introduced by low-quality retrieval over long contexts hinders LLMs from identifying effective factual details. To this end, we propose LongRAG, a general, dual-perspective, and robust LLM-based RAG system paradigm for LCQA that enhances RAG's understanding of complex long-context knowledge (i.e., global information and factual details). We design LongRAG as a plug-and-play paradigm, facilitating adaptation to various domains and LLMs. Extensive experiments on three multi-hop datasets demonstrate that LongRAG significantly outperforms long-context LLMs (by up to 6.94%), advanced RAG (by up to 6.16%), and Vanilla RAG (by up to 17.25%). Furthermore, we conduct quantitative ablation studies and multi-dimensional analyses, highlighting the effectiveness of the system's components and fine-tuning strategies. Data and code are available at https://github.com/QingFei1/LongRAG.
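To make the chunking issue concrete, the sketch below shows the fixed-size chunk-and-retrieve step of a Vanilla RAG pipeline: splitting a long document into independent chunks severs sentences from their surrounding context, and a noisy retriever may surface chunks that lack the needed factual detail. This is an illustrative stand-in (naive word-overlap scoring instead of a real dense or BM25 retriever), not LongRAG's actual method; all names and sizes are hypothetical.

```python
# Illustrative Vanilla RAG chunk-and-retrieve step (NOT LongRAG's method).
# Fixed-size chunking disrupts global context; naive scoring stands in
# for a real retriever and shows how noise can degrade retrieval quality.

def chunk(text: str, size: int = 40) -> list[str]:
    """Split a document into fixed-size character chunks.
    Sentences spanning a chunk boundary lose their global context."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the query (a crude stand-in
    for a dense or BM25 retriever) and return the top-k chunks."""
    q = {w.strip(".,").lower() for w in query.split()}
    def score(c: str) -> int:
        return len(q & {w.strip(".,").lower() for w in c.split()})
    return sorted(chunks, key=score, reverse=True)[:k]

doc = "Paris is the capital of France. Berlin is the capital of Germany."
top = retrieve("capital of France", chunk(doc), k=1)
```

In this toy example, the boundary at 40 characters splits the second sentence mid-word, so the retrieved chunk carries only a fragment of the evidence; LongRAG's dual-perspective design is motivated by exactly this loss of global information.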