Text embedding inversion attacks reconstruct original sentences from latent representations, posing severe privacy threats in collaborative inference and edge computing. We propose TextCrafter, an optimization-based adversarial perturbation mechanism that combines RL-learned, geometry-aware noise injection orthogonal to user embeddings with cluster priors and PII signal guidance to suppress inversion while preserving task utility. Unlike prior defenses, which are either non-learnable or agnostic to perturbation direction, TextCrafter provides a directional protective policy that balances privacy and utility. Under a strong privacy setting, TextCrafter maintains 70% classification accuracy on four datasets and consistently outperforms Gaussian and local differential privacy (LDP) baselines across lower privacy budgets, demonstrating a superior privacy-utility trade-off.
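To make the idea of noise injected orthogonally to a user embedding concrete, the following is a minimal sketch in Python with NumPy. It is not TextCrafter's learned policy: the RL component, cluster priors, and PII signal guidance are omitted, and the function names `orthogonal_noise` and `perturb`, the fixed `scale` parameter, and the Gaussian noise source are all assumptions made for illustration. It only shows how a random perturbation can be constrained to the subspace orthogonal to a nonzero embedding.

```python
import numpy as np


def orthogonal_noise(embedding, scale, rng):
    """Return noise of norm `scale` that is orthogonal to `embedding`.

    Hypothetical helper: a fixed-scale Gaussian stand-in for a learned
    perturbation policy. Assumes `embedding` is a nonzero 1-D vector.
    """
    e = np.asarray(embedding, dtype=np.float64)
    unit = e / np.linalg.norm(e)
    noise = rng.standard_normal(e.shape)
    # Remove the component of the noise that lies along the embedding.
    noise -= np.dot(noise, unit) * unit
    # Rescale so the perturbation has exactly the requested magnitude.
    noise *= scale / np.linalg.norm(noise)
    return noise


def perturb(embedding, scale=0.5, seed=0):
    """Add orthogonal noise to an embedding (illustrative only)."""
    rng = np.random.default_rng(seed)
    e = np.asarray(embedding, dtype=np.float64)
    return e + orthogonal_noise(e, scale, rng)
```

Because the noise has no component along the original embedding, the projection of the perturbed vector onto the original direction is unchanged, while its norm grows from ‖e‖ to sqrt(‖e‖² + scale²). How a learned policy would choose the direction within the orthogonal subspace is outside the scope of this sketch.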