The nonparametric variational information bottleneck (NVIB) provides the foundation for nonparametric variational differential privacy (NVDP), a framework for building privacy-preserving language models. However, the learned latent representations can drift into regions of high information content, which weakens the privacy guarantees and also degrades utility through numerical instability during training. In this work, we introduce a principled parameter-clipping strategy that directly addresses this issue. Our method is mathematically derived from the objective of minimizing the Rényi divergence (RD) upper bound, yielding specific, theoretically grounded constraints on the posterior mean, variance, and mixture-weight parameters. We apply our technique to an NVIB-based model and empirically compare it against an unconstrained baseline. Our findings demonstrate that the clipped model consistently achieves tighter RD bounds, implying stronger privacy, while simultaneously attaining higher performance on several downstream tasks. This work presents a simple yet effective method for improving the privacy-utility trade-off in variational models, making them more robust and practical.
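The clipping strategy described above can be sketched as follows. This is a minimal illustration only: the abstract does not give the derived thresholds, so the bound values (`mu_max`, `sigma2_min`, `sigma2_max`, `pi_max`) and the function name are hypothetical placeholders, and the paper's actual constraints on the posterior mean, variance, and mixture weights would be the specific values derived from the RD upper bound.

```python
import numpy as np

def clip_posterior_params(mu, sigma2, pi,
                          mu_max=1.0, sigma2_min=0.1,
                          sigma2_max=1.0, pi_max=0.9):
    """Illustrative clipping of mixture-posterior parameters.

    All threshold values are hypothetical; the paper derives its
    specific constraints from minimizing the RD upper bound.
    """
    # Rescale each mean vector so its L2 norm is at most mu_max.
    mu_norm = np.linalg.norm(mu, axis=-1, keepdims=True)
    mu_c = mu * np.minimum(1.0, mu_max / np.maximum(mu_norm, 1e-12))

    # Keep variances inside a bounded interval to avoid degenerate
    # (numerically unstable) components.
    sigma2_c = np.clip(sigma2, sigma2_min, sigma2_max)

    # Cap individual mixture weights, then renormalize so they
    # still sum to one.
    pi_c = np.minimum(pi, pi_max)
    pi_c = pi_c / pi_c.sum(axis=-1, keepdims=True)

    return mu_c, sigma2_c, pi_c
```

In a training loop, such a projection would typically be applied to the encoder's predicted posterior parameters before the RD (or KL) term is evaluated, so that the bound is computed on the constrained parameters.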