Local differential privacy (LDP) is a powerful method for privacy-preserving data collection. In this paper, we develop a framework for training Generative Adversarial Networks (GANs) on differentially privatized data. We show that entropic regularization of the Wasserstein distance, a popular regularization method often leveraged for its computational benefits, can be used to denoise the data distribution when the data is privatized by common additive noise mechanisms, such as the Laplace and Gaussian mechanisms. This combination mitigates both the regularization bias and the effects of the privatization noise, thereby improving the model's overall performance. We analyze the proposed method and provide sample complexity results and experimental evidence to support its efficacy.
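To make the additive noise setting concrete, the following is a minimal sketch of a Laplace mechanism applied locally to each record before collection. It is an illustration under stated assumptions, not the paper's implementation: the function name `privatize_laplace`, the per-coordinate clipping bound `clip`, and the choice to calibrate the noise to the L1 diameter of the clipped domain are all hypothetical choices made here.

```python
import numpy as np


def privatize_laplace(x, epsilon, clip=1.0, rng=None):
    """Privatize records locally with additive Laplace noise (illustrative sketch).

    Assumptions (not taken from the paper): each record has shape (..., d),
    every coordinate is clipped to [-clip, clip], and the noise scale is
    calibrated to the L1 diameter of that clipped domain.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.clip(np.asarray(x, dtype=float), -clip, clip)
    d = x.shape[-1]
    # Two records in [-clip, clip]^d differ by at most 2 * clip * d in L1 norm.
    sensitivity = 2.0 * clip * d
    # Laplace scale b = sensitivity / epsilon gives epsilon-LDP per record.
    scale = sensitivity / epsilon
    return x + rng.laplace(loc=0.0, scale=scale, size=x.shape)


# Usage: privatize a small batch of 2-dimensional records.
batch = np.array([[0.2, -0.5], [0.9, 0.1]])
noisy_batch = privatize_laplace(batch, epsilon=1.0)
```

Because the noise distribution is known to the collector, a model trained on `noisy_batch` can in principle account for it, which is the kind of denoising the abstract attributes to entropic regularization.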