Text-to-image diffusion models (T2I DMs), exemplified by Stable Diffusion, generate highly realistic images from textual input and have been widely adopted, but their flexibility also makes them prone to misuse for producing harmful or unsafe content. Concept unlearning has emerged as a way to prevent T2I DMs from generating such undesirable visual content. However, existing methods struggle to balance unlearning effectiveness against the preservation of generation quality. To address this limitation, we propose Key Step Concept Unlearning (KSCU), which selectively fine-tunes the model only at the denoising steps that are key to the target concept. KSCU is motivated by the observation that different denoising steps contribute unequally to the final generation. Compared with previous approaches, which treat all denoising steps uniformly, KSCU avoids over-optimizing unnecessary steps, improving effectiveness, and reduces the number of parameter updates, improving efficiency. For example, on the I2P dataset, KSCU outperforms ESD by 8.3% in nudity unlearning accuracy while improving FID by 8.4%, and achieves a high overall score of 0.92, substantially surpassing all other state-of-the-art methods.
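To make the core idea concrete, below is a minimal sketch of what restricting fine-tuning to key denoising steps could look like. It assumes an ESD-style negative-guidance objective and a toy noise predictor; all names here (`TinyEps`, `KEY_STEPS`, `eta`) and the choice of step range are illustrative assumptions, not the paper's actual implementation, which selects key steps per target concept.

```python
# Sketch: concept unlearning restricted to a subset of "key" denoising steps.
# Assumes an ESD-style objective; TinyEps is a toy stand-in for the U-Net.
import torch
import torch.nn as nn

T = 1000                      # total diffusion steps
KEY_STEPS = range(600, 1000)  # assumed key (high-noise) steps; illustrative only
eta = 1.0                     # negative-guidance strength (ESD-style)

class TinyEps(nn.Module):
    """Toy stand-in for the noise predictor eps(x_t, t, c)."""
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 2, 64), nn.SiLU(),
                                 nn.Linear(64, dim))

    def forward(self, x, t, c):
        t_emb = t.float().unsqueeze(-1) / T   # normalized timestep
        c_emb = c.float().unsqueeze(-1)       # 1 = target concept, 0 = null
        return self.net(torch.cat([x, t_emb, c_emb], dim=-1))

frozen = TinyEps()                            # original model, kept fixed
student = TinyEps()
student.load_state_dict(frozen.state_dict())  # copy to fine-tune
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

for step in range(100):
    x_t = torch.randn(8, 16)                  # dummy noised latents
    # Key-step restriction: sample t only from KEY_STEPS, not all of [0, T).
    t = torch.randint(KEY_STEPS.start, KEY_STEPS.stop, (8,))
    c = torch.ones(8)                         # target-concept condition
    with torch.no_grad():
        e_uncond = frozen(x_t, t, torch.zeros(8))
        e_cond = frozen(x_t, t, c)
        # Steer the prediction away from the concept (negative guidance).
        target = e_uncond - eta * (e_cond - e_uncond)
    loss = (student(x_t, t, c) - target).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because gradients are only taken at the sampled key steps, the remaining steps receive no parameter updates, which is where the claimed gains in effectiveness (no over-optimization of unnecessary steps) and efficiency (fewer updates) would come from.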