Differentially Private Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) is a widely used framework for reducing hallucinations in large language models (LLMs) on domain-specific tasks by retrieving relevant documents from a database to support accurate responses. However, when the database contains sensitive corpora, such as medical records or legal documents, RAG poses serious privacy risks because its outputs can expose private information. Prior work has demonstrated that adversarial prompts can, in practice, force an LLM to regurgitate the augmented contexts. A promising direction is to integrate differential privacy (DP), a privacy notion that offers strong formal guarantees, into RAG systems. However, naively applying DP mechanisms to existing systems often leads to significant utility degradation. For RAG systems in particular, DP can reduce the usefulness of the augmented contexts, increasing the risk of hallucination by the LLM. Motivated by these challenges, we present DP-KSA, a novel privacy-preserving RAG algorithm that integrates DP using the propose-test-release paradigm. DP-KSA builds on the key observation that most question-answering (QA) queries can be sufficiently answered with a few keywords. Hence, DP-KSA first obtains an ensemble of relevant contexts, each of which is used to generate a response from an LLM. We then use these responses to obtain the most frequent keywords in a differentially private manner. Lastly, the keywords are augmented into the prompt for the final output. This approach effectively compresses the semantic space while preserving both utility and privacy. We prove that DP-KSA provides formal DP guarantees on the generated output with respect to the RAG database. We evaluate DP-KSA on two QA benchmarks using three instruction-tuned LLMs, and our empirical results demonstrate that DP-KSA achieves a strong privacy-utility tradeoff.

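The keyword-selection step described in the abstract can be illustrated with a minimal sketch. This is not the DP-KSA mechanism itself: the abstract states that DP-KSA uses propose-test-release, whereas the sketch below swaps in the simpler Laplace mechanism over a fixed, data-independent candidate vocabulary. The function names, the `vocabulary` and `max_keywords` parameters, and the assumption that changing one database record changes at most one ensemble response are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_top_keywords(responses, vocabulary, k, epsilon, max_keywords, rng):
    """Pick k keywords from noisy per-keyword response counts.

    Assumptions (not taken from the paper):
      * `vocabulary` is a fixed list chosen independently of the database.
      * One database record influences at most one response.
      * Each response contributes at most `max_keywords` distinct keywords.
    Under these assumptions, replacing one response changes the count
    vector by at most 2 * max_keywords in L1 norm, so Laplace noise with
    scale 2 * max_keywords / epsilon is used on every count.
    """
    counts = Counter()
    for text in responses:
        tokens = set(text.lower().split())
        # Count each keyword at most once per response, capped at max_keywords.
        present = [w for w in vocabulary if w in tokens][:max_keywords]
        counts.update(present)

    scale = 2.0 * max_keywords / epsilon
    noisy = {w: counts[w] + laplace_noise(scale, rng) for w in vocabulary}
    # Ranking the noisy counts is post-processing of the noisy histogram.
    return sorted(vocabulary, key=lambda w: noisy[w], reverse=True)[:k]


def build_prompt(question, keywords):
    # Hypothetical prompt template: only the selected keywords, not the
    # retrieved documents, are added to the final prompt.
    return f"Question: {question}\nRelevant keywords: {', '.join(keywords)}\nAnswer:"
```

A caller would pass the ensemble of LLM responses (one per retrieved context), a candidate vocabulary, and a privacy budget, then feed the result of `build_prompt` to the LLM for the final answer. Restricting selection to a fixed vocabulary is a simplification; choosing keywords from an open, data-dependent set is where a more careful mechanism such as propose-test-release becomes relevant.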