Despite significant advances, large language models (LLMs) still struggle to provide accurate answers when they lack domain-specific or up-to-date knowledge. Retrieval-Augmented Generation (RAG) addresses this limitation by incorporating external knowledge bases, but it also introduces new attack surfaces. In this paper, we investigate data extraction attacks that target the knowledge databases of RAG systems. We demonstrate that previous attacks on RAG depend largely on the instruction-following capabilities of LLMs, and that simple fine-tuning can reduce their success rate to nearly zero. This makes such attacks impractical, since fine-tuning is common practice when deploying LLMs in specific domains. To further expose the vulnerability, we propose backdooring RAG: a small portion of poisoned data is injected during the fine-tuning phase to create a backdoor within the LLM. When the compromised LLM is integrated into a RAG system, attackers can use specific triggers in prompts to make the LLM leak documents from the retrieval database. By carefully designing the poisoned data, we achieve both verbatim and paraphrased document extraction. With only 3% poisoned data, our method achieves an average success rate of 79.7% in verbatim extraction on Llama2-7B, with a ROUGE-L score of 64.21, and an average success rate of 68.6% in paraphrased extraction, with an average ROUGE score of 52.6 across four datasets. These results underscore the privacy risks in the supply chain when deploying RAG systems.
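The abstract reports extraction quality with ROUGE-L, a metric based on the longest common subsequence (LCS) between a reference document and the extracted text. The following is a minimal sketch of that metric, not the authors' evaluation code: it assumes lowercased whitespace tokenization and the balanced F1 form, and the function names `lcs_length` and `rouge_l_f1` are hypothetical. The exact ROUGE configuration behind the reported numbers is not stated in the abstract.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 in [0, 1]; tokenization here is an assumption (lowercase, whitespace)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the extracted tokens that match
    recall = lcs / len(ref)      # share of the reference document recovered
    return 2 * precision * recall / (precision + recall)
```

A score near 1 means the model output reproduces the retrieved document almost word for word; figures such as 64.21 appear to be the same kind of quantity on a 0 to 100 scale, though that scaling is an assumption here.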