Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by incorporating external knowledge bases, but they are vulnerable to privacy risks from data extraction attacks. Existing extraction methods typically rely on malicious inputs such as prompt injection or jailbreaking, making them easy to detect at the input or output level. In this paper, we introduce the Implicit Knowledge Extraction Attack (IKEA), which extracts knowledge from RAG systems through benign queries. IKEA first leverages anchor concepts to generate natural-looking queries, and then designs two mechanisms that guide the anchor concepts to thoroughly 'explore' the RAG system's private knowledge: (1) Experience Reflection Sampling, which samples anchor concepts based on past query-response patterns to keep queries relevant to the RAG documents; (2) Trust Region Directed Mutation, which iteratively mutates anchor concepts under similarity constraints to further exploit the embedding space. Extensive experiments demonstrate IKEA's effectiveness under various defenses, surpassing baselines by over 80% in extraction efficiency and over 90% in attack success rate. Moreover, the substitute RAG system built from IKEA's extractions consistently outperforms those built from baseline methods across multiple evaluation tasks, underscoring the significant privacy risk in RAG systems.
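To make the two mechanisms concrete, the sketch below illustrates one attack round under stated assumptions; it is not the authors' implementation. The helpers embed(), query_rag(), and response_is_informative() are hypothetical placeholders, and the reward weights and similarity thresholds are assumed values rather than the paper's.

```python
import numpy as np

# --- Hypothetical interfaces (placeholders, not the paper's code or a real API) ---

def embed(text: str) -> np.ndarray:
    """Stand-in text embedder; a real attack would call a sentence encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

def query_rag(query: str) -> str:
    """Stand-in for the victim RAG system's question-answering endpoint."""
    return f"(retrieved answer for: {query})"

def response_is_informative(response: str) -> bool:
    """Stand-in check that the response is not a refusal or off-topic reply."""
    return "sorry" not in response.lower()

# --- Illustrative sketch of the two mechanisms ---

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def experience_reflection_sample(anchors, history, rng):
    """Sample the next anchor concept, down-weighting concepts whose earlier
    queries produced uninformative responses (the 'experience reflection')."""
    weights = np.array([history.get(a, 1.0) for a in anchors])
    return rng.choice(anchors, p=weights / weights.sum())

def trust_region_mutate(anchor, candidates, tau_low=0.6, tau_high=0.9):
    """Pick a mutated concept whose similarity to the current anchor stays in
    the trust region [tau_low, tau_high]: close enough to stay on-topic, far
    enough to reach unexplored regions of the embedding space."""
    a_vec = embed(anchor)
    scored = [(cosine(a_vec, embed(c)), c) for c in candidates]
    in_region = [(s, c) for s, c in scored if tau_low <= s <= tau_high]
    return max(in_region)[1] if in_region else None

def ikea_round(anchors, history, candidates, rng):
    """One benign-looking query round: sample an anchor, query, score, mutate."""
    anchor = experience_reflection_sample(anchors, history, rng)
    response = query_rag(f"Could you tell me more about {anchor}?")
    history[anchor] = 2.0 if response_is_informative(response) else 0.5
    mutated = trust_region_mutate(anchor, candidates)
    if mutated is not None and mutated not in anchors:
        anchors.append(mutated)
    return response

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anchors = ["patient medication records", "clinical trial outcomes"]
    candidates = ["drug dosage guidelines", "insurance claims", "lab results"]
    history = {}
    for _ in range(3):
        print(ikea_round(anchors, history, candidates, rng))
```

The key design point the sketch tries to capture is that every query stays benign in appearance; coverage of the private corpus comes only from how anchors are sampled and mutated, not from adversarial prompt content.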