Text embeddings enable numerous NLP applications but face severe privacy risks from embedding inversion attacks, which can expose sensitive attributes or reconstruct raw text. Existing differential privacy defenses assume uniform sensitivity across embedding dimensions, leading to excessive noise and degraded utility. We propose SPARSE, a user-centric framework for concept-specific privacy protection in text embeddings. SPARSE combines (1) differentiable mask learning to identify privacy-sensitive dimensions for user-defined concepts, and (2) a Mahalanobis mechanism that applies elliptical noise calibrated to per-dimension sensitivity. Unlike traditional spherical noise injection, SPARSE selectively perturbs privacy-sensitive dimensions while preserving non-sensitive semantics. Evaluated on six datasets with three embedding models under multiple attack scenarios, SPARSE consistently reduces privacy leakage while achieving superior downstream performance compared to state-of-the-art DP methods.
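The elliptical noise idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a sensitivity vector (which SPARSE would obtain via mask learning) is given, and it stretches a standard spherical multivariate-Laplace perturbation, the usual metric-DP mechanism for embeddings, per dimension so that privacy-sensitive dimensions receive more noise. The function name and parameters are illustrative.

```python
import numpy as np

def elliptical_noise(embedding, sensitivity, epsilon, rng=None):
    """Hedged sketch of dimension-calibrated elliptical noise.

    A spherical d-dimensional Laplace sample (uniform direction,
    Gamma(d, 1/epsilon) radius) is rescaled elementwise by a
    `sensitivity` vector, so high-sensitivity dimensions are
    perturbed more and low-sensitivity dimensions are preserved.
    `sensitivity` is assumed to come from a separate mask-learning
    step; here it is just an input.
    """
    rng = np.random.default_rng(rng)
    d = embedding.shape[0]
    # Uniform random direction on the unit sphere.
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    # Radius distribution of the d-dimensional Laplace mechanism.
    radius = rng.gamma(shape=d, scale=1.0 / epsilon)
    spherical = radius * direction
    # Elliptical scaling: zero sensitivity leaves a dimension untouched.
    return embedding + sensitivity * spherical
```

With `sensitivity` set to all ones this reduces to ordinary spherical noise; setting entries to zero leaves those dimensions exactly unperturbed, which is the selective behavior the abstract contrasts with uniform-noise defenses.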