Retrieval-Augmented Generation (RAG) systems are usually defined by the combination of a generator and a retrieval component that extracts textual context from a knowledge base to answer user queries. However, such basic implementations exhibit several limitations, including noisy or suboptimal retrieval, misuse of retrieval for out-of-scope queries, weak query-document matching, and variability or cost associated with the generator. These shortcomings have motivated the development of "Enhanced" RAG, where dedicated modules are introduced to address specific weaknesses in the workflow. More recently, the growing self-reflective capabilities of Large Language Models (LLMs) have enabled a new paradigm, often referred to as "Agentic" RAG. In this approach, an LLM orchestrates the entire process, deciding which actions to perform, when to perform them, and whether to iterate. Despite the rapid adoption of both paradigms, it remains unclear which approach is preferable under which conditions. In this work, we conduct an empirically driven evaluation of "Enhanced" and "Agentic" RAG across multiple scenarios and dimensions. Our results provide practical insights into the trade-offs between the two paradigms, offering guidance on selecting the most effective RAG design for real-world applications, considering both performance and costs.