Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG

Large Language Models (LLMs) have advanced artificial intelligence by enabling human-like text generation and natural language understanding. However, their reliance on static training data limits their ability to respond to dynamic, real-time queries, resulting in outdated or inaccurate outputs. Retrieval-Augmented Generation (RAG) has emerged as a solution, enhancing LLMs by integrating real-time data retrieval to provide contextually relevant and up-to-date responses. Despite its promise, traditional RAG systems are constrained by static workflows and lack the adaptability required for multi-step reasoning and complex task management. Agentic Retrieval-Augmented Generation (Agentic RAG) addresses these limitations by embedding autonomous AI agents into the RAG pipeline. These agents apply four agentic design patterns (reflection, planning, tool use, and multi-agent collaboration) to dynamically manage retrieval strategies, iteratively refine contextual understanding, and adapt workflows through operational structures ranging from sequential steps to adaptive collaboration. This integration enables Agentic RAG systems to deliver flexibility, scalability, and context-awareness across diverse applications. This paper presents an analytical survey of Agentic RAG systems. It traces the evolution of RAG paradigms, introduces a principled taxonomy of Agentic RAG architectures based on agent cardinality, control structure, autonomy, and knowledge representation, and provides a comparative analysis of design trade-offs across existing frameworks. The survey examines applications in healthcare, finance, education, and enterprise document processing, and distills practical lessons for system designers and practitioners. Finally, it identifies key open research challenges related to evaluation, coordination, memory management, efficiency, and governance, and outlines directions for future research.
