Retrieval-Augmented Generation (RAG) significantly mitigates hallucinations and domain-knowledge deficiencies in large language models by incorporating external knowledge bases. However, the multi-module architecture of RAG introduces complex system-level security vulnerabilities. Guided by the RAG workflow, this paper analyzes the underlying vulnerability mechanisms and systematically categorizes core threat vectors such as data poisoning, adversarial attacks, and membership inference attacks. Based on this threat assessment, we construct a taxonomy of RAG defense technologies from a dual perspective covering the input and output stages. The input-side analysis reviews data protection mechanisms, including dynamic access control, homomorphic encryption retrieval, and adversarial pre-filtering. The output-side analysis summarizes leakage prevention techniques such as federated learning isolation, differential privacy perturbation, and lightweight data sanitization. To establish a unified benchmark for future experimental design, we consolidate authoritative test datasets, security standards, and evaluation frameworks. To the best of our knowledge, this paper presents the first end-to-end survey dedicated to the security of RAG systems. Unlike existing literature, which examines specific vulnerabilities in isolation, we systematically map the entire pipeline, providing a unified analysis of threat models, defense mechanisms, and evaluation benchmarks. By offering deeper insight into potential risks, this work seeks to foster the development of robust and trustworthy next-generation RAG systems.