Generative models trained on sensitive image datasets risk memorizing and reproducing individual training examples, making strong privacy guarantees essential. While differential privacy (DP) provides a principled framework for such guarantees, standard DP finetuning (e.g., with DP-SGD) often severely degrades image quality, particularly in high-frequency textures, because noise is added indiscriminately across all model parameters. In this work, we propose a spectral DP framework based on the hypothesis that the most privacy-sensitive portions of an image are often the low-frequency components in wavelet space (e.g., facial features and object shapes), while the high-frequency components are largely generic and public. Building on this hypothesis, we introduce a two-stage framework for DP image generation with coarse image intermediaries: (1) DP finetune an autoregressive spectral image tokenizer model on the low-resolution wavelet coefficients of the sensitive images, and (2) perform high-resolution upsampling using a publicly pretrained super-resolution model. By restricting the privacy budget to the global structures of the image in the first stage, and leveraging the post-processing property of DP for detail refinement in the second, we achieve promising trade-offs between privacy and utility. Experiments on the MS-COCO and MM-CelebA-HQ datasets show that our method generates images with improved quality and style capture relative to other leading DP image generation frameworks.
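To make the notion of "low-resolution wavelet coefficients" concrete, the following is a minimal sketch of extracting the coarse (low-frequency) subband of an image. It assumes an orthonormal Haar wavelet and a grayscale image with even side lengths; the abstract does not specify the wavelet family or the number of decomposition levels, and the function names `haar_dwt2`, `haar_idwt2`, and `coarse_subband` are hypothetical names chosen for illustration, not the authors' implementation.

```python
import numpy as np


def haar_dwt2(img):
    """One level of a 2D orthonormal Haar transform.

    Assumption: `img` is a 2D float array with even height and width.
    Returns the low-frequency subband LL and the three detail subbands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # coarse structure (scaled block average)
    lh = (a - b + c - d) / 2.0  # differences between columns
    hl = (a + b - c - d) / 2.0  # differences between rows
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, (lh, hl, hh)


def haar_idwt2(ll, details):
    """Invert `haar_dwt2`, rebuilding the full-resolution array."""
    lh, hl, hh = details
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


def coarse_subband(img, levels=1):
    """Keep only the LL subband after `levels` decompositions.

    Each level halves both spatial dimensions, so the side lengths
    must be divisible by 2**levels.
    """
    ll = np.asarray(img, dtype=float)
    for _ in range(levels):
        ll, _details = haar_dwt2(ll)
    return ll
```

Under this reading, only the output of `coarse_subband` would be exposed to the DP-finetuned model in stage (1), while the discarded detail subbands correspond to the information that stage (2) regenerates with a public super-resolution model. Because stage (2) never touches the sensitive data, it consumes no additional privacy budget by the post-processing property of DP.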