Conditional Generative Adversarial Networks (CGANs) show significant potential for training supervised learning models because they can generate realistic labeled images. However, numerous studies have shown that CGAN models carry a risk of privacy leakage. DPCGAN, an existing solution built on the differential privacy framework, relies heavily on labeled data for training and can distort the original gradient information through excessive gradient clipping, which makes it difficult to preserve model accuracy. To address these challenges, we present a privacy-preserving training framework called PATE-TripleGAN. The framework adds a classifier that pre-classifies unlabeled data, establishing a three-party min-max game that reduces dependence on labeled data. Furthermore, we present a hybrid gradient desensitization algorithm based on the Private Aggregation of Teacher Ensembles (PATE) framework and the Differentially Private Stochastic Gradient Descent (DPSGD) method. This algorithm allows the model to retain gradient information more effectively while ensuring privacy protection, thereby improving the model's utility. Privacy analysis and extensive experiments confirm that PATE-TripleGAN can generate a higher-quality labeled image dataset while protecting the privacy of the training data.
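To make the two privacy mechanisms named in the abstract concrete, the following is a minimal sketch of their standard building blocks: the clip-and-noise step of DPSGD and the noisy vote aggregation of PATE. It is not the paper's hybrid gradient desensitization algorithm, whose details the abstract does not give. The function names `dpsgd_gradient` and `pate_noisy_vote` and the parameters `clip_norm`, `noise_multiplier`, and `laplace_scale` are hypothetical, chosen here for illustration only.

```python
import math
import random


def dpsgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Standard DPSGD step (illustrative, not the paper's hybrid method).

    Each per-example gradient is rescaled so its L2 norm is at most
    clip_norm, the clipped gradients are summed, Gaussian noise with
    standard deviation noise_multiplier * clip_norm is added to every
    coordinate, and the result is averaged over the batch.
    """
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Shrink only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale
    sigma = noise_multiplier * clip_norm
    return [(t + rng.gauss(0.0, sigma)) / batch_size for t in total]


def pate_noisy_vote(teacher_votes, num_classes, laplace_scale, rng):
    """PATE-style noisy-max aggregation (illustrative).

    Counts the teachers' label votes, perturbs every count with Laplace
    noise of the given scale, and returns the label with the largest
    noisy count.
    """
    counts = [0.0] * num_classes
    for vote in teacher_votes:
        counts[vote] += 1.0
    if laplace_scale > 0:
        # The difference of two i.i.d. exponentials with rate 1/scale
        # is a Laplace(0, scale) sample.
        rate = 1.0 / laplace_scale
        counts = [
            c + rng.expovariate(rate) - rng.expovariate(rate) for c in counts
        ]
    return max(range(num_classes), key=lambda k: counts[k])


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]  # the first gradient has norm 5 and is clipped
    print(dpsgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
    print(pate_noisy_vote([1, 1, 1, 0, 2], num_classes=3, laplace_scale=1.0, rng=rng))
```

The sketch illustrates the trade-off the abstract describes: a small `clip_norm` bounds each example's influence but discards gradient magnitude, whereas vote aggregation adds noise to label counts rather than to the gradients themselves.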