Retrieval-Augmented Generation (RAG) systems have shown great promise in natural language processing. However, their reliance on data stored in a retrieval database, which may contain proprietary or sensitive information, introduces new privacy concerns. Specifically, an attacker may be able to infer whether a certain text passage appears in the retrieval database by observing the outputs of the RAG system, an attack known as a Membership Inference Attack (MIA). Despite the significance of this threat, MIAs against RAG systems remain under-explored. This study addresses this gap by introducing an efficient and easy-to-use method for conducting MIAs against RAG systems. We demonstrate the effectiveness of our attack using two benchmark datasets and multiple generative models, showing that the membership of a document in the retrieval database can be efficiently determined by crafting an appropriate prompt, in both black-box and gray-box settings. Our findings highlight the importance of implementing security countermeasures in deployed RAG systems to protect the privacy and security of retrieval databases.
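To make the idea of prompt-based membership inference concrete, the following is a minimal black-box sketch in Python. It is not the method from this study: `ToyRAG`, `build_attack_prompt`, and `infer_membership` are hypothetical names, the retriever is a simple bag-of-words cosine ranking, the "generator" is a rule that stands in for a language model, and the prompt wording is an assumption chosen for illustration. The attacker only sees the system's textual reply.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity between two strings."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


class ToyRAG:
    """Hypothetical stand-in for a deployed RAG system (not a real library)."""

    def __init__(self, database, k=2):
        self.database = list(database)
        self.k = k

    def retrieve(self, query):
        """Return the k database entries most similar to the query."""
        ranked = sorted(self.database, key=lambda d: cosine(query, d), reverse=True)
        return ranked[: self.k]

    def answer(self, prompt):
        """Toy generator: replies 'Yes' only if the passage quoted in the
        prompt appears verbatim in the retrieved context. A real system
        would call a language model here (assumption for illustration)."""
        context = self.retrieve(prompt)
        start, end = prompt.find('"'), prompt.rfind('"')
        quoted = prompt[start + 1 : end]
        return "Yes" if any(quoted in doc for doc in context) else "No"


def build_attack_prompt(passage):
    """Assumed prompt template: embed the candidate passage so that the
    retriever fetches it if present, then ask about the context."""
    return f'Answer with Yes or No. "{passage}"\nIs this part of your context?'


def infer_membership(rag, passage):
    """Black-box attack: decide membership from the reply text alone."""
    reply = rag.answer(build_attack_prompt(passage))
    return reply.strip().lower().startswith("yes")


database = [
    "The patient was prescribed 20 mg of atorvastatin daily.",
    "Quarterly revenue grew by twelve percent in the northern region.",
    "The server migration is scheduled for the first weekend of March.",
]
rag = ToyRAG(database)

print(infer_membership(rag, database[0]))
print(infer_membership(rag, "The committee approved the new parking policy last Tuesday."))
```

Because the candidate passage is embedded in the prompt, the retriever tends to return that passage when it is stored, which is what lets the reply leak membership. A real generator would answer less reliably than this rule-based stand-in, so a practical evaluation would measure attack accuracy over many member and non-member samples.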