Diffusion models, known for their remarkable ability to generate novel, high-quality samples, have recently raised concerns due to their data memorization behavior, which poses privacy risks. Recent approaches to memorization mitigation either focus only on the text modality in cross-modal generation tasks or rely on data augmentation strategies. In this paper, we propose a novel training framework for diffusion models from the perspective of the visual modality, which is more generic and fundamental for mitigating memorization. To encourage the forgetting of information stored in the diffusion model's parameters, we propose an iterative ensemble training strategy: the data are split into multiple shards, each used to train a separate model, and these model parameters are aggregated intermittently. Moreover, empirical analysis of the losses shows that the training loss for easily memorized images tends to be markedly lower. We therefore propose an anti-gradient control method that excludes samples with low loss values from the current mini-batch to avoid memorization. Extensive experiments and analyses on four datasets demonstrate the effectiveness of our method, and results show that it successfully reduces memorization while even slightly improving performance. Furthermore, to save computational cost, we successfully apply our method to fine-tune well-trained diffusion models for a limited number of epochs, demonstrating its applicability. Code is available at https://github.com/liuxiao-guan/IET_AGC.
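The two mechanisms described above can be illustrated with a minimal sketch. This is not the paper's implementation; the function names, the flat-vector parameter representation, and the fixed loss threshold are all simplifying assumptions made here for illustration.

```python
import numpy as np

def aggregate_shard_parameters(shard_params):
    """Intermittent aggregation step (assumed to be simple averaging):
    average the parameters of the models trained on separate data shards."""
    # shard_params: list of per-model parameter lists; average tensor-wise.
    return [np.mean(np.stack(tensors), axis=0) for tensors in zip(*shard_params)]

def anti_gradient_mask(losses, threshold):
    """Anti-gradient control (sketch): easily memorized images show markedly
    lower training loss, so samples below a loss threshold are excluded from
    the current mini-batch's gradient update."""
    return np.asarray(losses) >= threshold
```

For example, averaging two single-tensor shard models `[1.0, 2.0]` and `[3.0, 4.0]` yields `[2.0, 3.0]`, and with `threshold=0.3` a batch with per-sample losses `[0.1, 0.9, 0.5]` keeps only the last two samples. In the actual method the threshold would be chosen relative to the batch's loss distribution rather than fixed.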