Retrieval-augmented generation (RAG) has emerged as a key technique for improving the response quality of large language models (LLMs) without high computational cost. In traditional architectures, RAG services are provided by a single entity that hosts the dataset within a trusted local environment. However, individuals and small organizations often lack the resources to maintain data storage servers and therefore rely on outsourced cloud storage. This dependence on untrusted third-party services introduces privacy risks. Embedding-based retrieval mechanisms, commonly used in RAG systems, are vulnerable to privacy leakage, such as vector-to-text reconstruction attacks and structural leakage via vector analysis. Several privacy-preserving RAG techniques have been proposed, but most existing approaches rely on partially homomorphic encryption, which incurs substantial computational overhead. To address these challenges, we propose ppRAG, an efficient privacy-preserving RAG framework tailored for untrusted cloud environments that defends against vector-to-text attacks, vector analysis, and query analysis. We introduce Conditional Approximate Distance-Comparison-Preserving Symmetric Encryption (CAPRISE), a scheme that encrypts embeddings while still allowing the cloud to compute similarity between an encrypted query and the encrypted database embeddings. CAPRISE preserves only the relative distance ordering between the encrypted query and each encrypted database embedding, without exposing inter-database distances, thereby enhancing both privacy and efficiency. To mitigate query analysis, we apply differential privacy (DP) by perturbing the query embedding prior to encryption, preventing the cloud from inferring sensitive patterns. Experimental results show that ppRAG achieves efficient processing throughput, high retrieval accuracy, and strong privacy guarantees, making it a practical solution for resource-constrained users seeking secure cloud-augmented LLMs.
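The pipeline described above (perturb the query embedding, encrypt it, and let the cloud rank encrypted database embeddings by distance) can be illustrated with a toy sketch. This is not CAPRISE: the abstract does not specify its construction, so the sketch substitutes a secret signed coordinate permutation, which is a much weaker transform that preserves all pairwise distances, including the inter-database distances that CAPRISE is designed to hide. The function names, the dimension, and the noise scale `sigma` are hypothetical, and the Gaussian noise is not calibrated to any formal privacy budget.

```python
import math
import random

def make_key(dim, rng):
    """Toy secret key (assumption, not CAPRISE): permutation plus sign flips."""
    perm = list(range(dim))
    rng.shuffle(perm)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    return perm, signs

def toy_encrypt(vec, key):
    """Apply the signed permutation; it is orthogonal, so distances are unchanged."""
    perm, signs = key
    return [signs[i] * vec[perm[i]] for i in range(len(vec))]

def perturb_query(vec, sigma, rng):
    """Add i.i.d. Gaussian noise to the query before encryption.
    sigma is illustrative only and not tied to an (epsilon, delta) guarantee."""
    return [x + rng.gauss(0.0, sigma) for x in vec]

def sq_dist(a, b):
    """Squared Euclidean distance; fsum makes the result independent of term order."""
    return math.fsum((x - y) ** 2 for x, y in zip(a, b))

def rank(query, database):
    """Database indices sorted nearest-first, as the cloud would compute them."""
    return sorted(range(len(database)), key=lambda i: sq_dist(query, database[i]))

rng = random.Random(0)
dim = 8
database = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(20)]
query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
key = make_key(dim, rng)

# Client side: perturb first, then encrypt both the query and the database.
noisy_query = perturb_query(query, sigma=0.05, rng=rng)
enc_db = [toy_encrypt(v, key) for v in database]
enc_query = toy_encrypt(noisy_query, key)

# Cloud side: ranking uses ciphertexts only.
cloud_ranking = rank(enc_query, enc_db)

# Reference ranking on plaintexts, for comparison.
plain_ranking = rank(noisy_query, database)
```

In this toy setting the cloud's ranking over ciphertexts matches the plaintext ranking for the perturbed query, which is the distance-comparison property the abstract relies on. Any loss in retrieval accuracy comes from the noise added to the query, so a larger `sigma` trades accuracy for stronger obfuscation of the query.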