Diffusion models (DMs) are widely used for generating high-quality, high-dimensional images, but they are typically trained without differential privacy (DP) guarantees. To address this challenge, recent papers suggest pre-training DMs with public data, then fine-tuning them with private data using DP-SGD for a relatively short period. In this paper, we further improve the current state of DMs with DP by adopting Latent Diffusion Models (LDMs). LDMs are equipped with powerful pre-trained autoencoders that map high-dimensional pixels into lower-dimensional latent representations, in which DMs are trained, yielding more efficient and faster training. In our algorithm, DP-LDMs, rather than fine-tuning the entire DM, we fine-tune only the attention modules of LDMs at varying layers with privacy-sensitive data, reducing the number of trainable parameters by roughly 90% and achieving better accuracy than fine-tuning the entire DM. The smaller parameter space to fine-tune with DP-SGD helps our algorithm achieve new state-of-the-art results on several public-private benchmark data pairs. Our approach also allows us to generate more realistic, high-dimensional images (256x256), as well as images conditioned on text prompts, with differential privacy, which, to the best of our knowledge, has not been attempted before. Our approach provides a promising direction for training more powerful yet training-efficient differentially private DMs that produce high-quality, high-dimensional DP images.
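To make the central idea concrete, the following is a minimal sketch of a single DP-SGD update applied only to a chosen subset of parameters, with every other parameter left frozen. It is not the authors' implementation. The function name `dp_sgd_step`, the `trainable` set, the toy parameter names (`attn.q`, `conv.w`), and the use of plain Python lists in place of tensors are all assumptions made for illustration. Privacy accounting (tracking the resulting epsilon and delta) is omitted.

```python
import math
import random


def dp_sgd_step(params, per_example_grads, trainable, clip_norm,
                noise_multiplier, lr, rng):
    """One DP-SGD update restricted to the parameters named in `trainable`.

    params:            dict mapping a parameter name to a list of floats
    per_example_grads: list of dicts (one per example) with the same layout
    trainable:         set of parameter names to update (hypothetical choice,
                       e.g. only the attention weights); the rest stay frozen
    """
    names = sorted(trainable)
    summed = {n: [0.0] * len(params[n]) for n in names}

    for grads in per_example_grads:
        # Clip each example's gradient to L2 norm <= clip_norm,
        # measured over the trainable coordinates only.
        norm = math.sqrt(sum(g * g for n in names for g in grads[n]))
        scale = min(1.0, clip_norm / (norm + 1e-12))
        for n in names:
            for i, g in enumerate(grads[n]):
                summed[n][i] += scale * g

    batch_size = len(per_example_grads)
    new_params = {n: list(v) for n, v in params.items()}
    for n in names:
        for i in range(len(params[n])):
            # Gaussian noise with std = noise_multiplier * clip_norm
            # is added to the sum of clipped gradients.
            noise = rng.gauss(0.0, noise_multiplier * clip_norm)
            new_params[n][i] -= lr * (summed[n][i] + noise) / batch_size
    return new_params


# Toy usage: only parameters whose (assumed) name contains "attn" are updated.
params = {"attn.q": [1.0, 1.0], "conv.w": [1.0]}
grads = [
    {"attn.q": [3.0, 4.0], "conv.w": [10.0]},
    {"attn.q": [0.0, 0.5], "conv.w": [10.0]},
]
trainable = {name for name in params if "attn" in name}
updated = dp_sgd_step(params, grads, trainable, clip_norm=1.0,
                      noise_multiplier=1.0, lr=0.1, rng=random.Random(0))
```

Because noise is drawn once per trainable coordinate, shrinking the trainable set reduces the total amount of noise injected into the model at each step, which is one plausible reading of why a smaller parameter space helps under DP-SGD.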