Diffusion models generate high-quality images but pose risks, such as the unintentional generation of NSFW (not safe for work) content. Although various techniques have been proposed to remove unwanted influences from diffusion models while preserving overall performance, balancing these two goals remains challenging. In this work, we introduce EraseDiff, an algorithm designed to preserve the utility of the diffusion model on the retained data while removing the unwanted information associated with the data to be forgotten. Our approach formulates this task as a constrained optimization problem using the value function, which yields a natural first-order algorithm for solving it. By altering the generative process so that it deviates from the ground-truth denoising trajectory, we update the parameters to preserve utility while controlling the reduction of the constraint to ensure effective erasure, balancing the two objectives. Extensive experiments and thorough comparisons with state-of-the-art algorithms demonstrate that EraseDiff preserves the model's utility while erasing the targeted information effectively and efficiently.
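To make the idea of a value-function constraint solved by a first-order method more concrete, the following is a minimal toy sketch in Python. It is not the EraseDiff implementation. It assumes two hypothetical quadratic losses on a two-dimensional parameter vector: a retain loss standing in for utility on the retained data, and an erasure objective that is low once the unwanted information is removed. The constraint estimate `g_hat` is the erasure objective at the current parameters minus its value after a few inner gradient steps (an approximation of the value function). The multiplier rule, the function names, and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy targets, chosen only for illustration.
RETAIN_TARGET = np.array([3.0, -2.0])  # minimizer of the retain loss
ERASE_TARGET = 1.0                     # erasure objective only constrains theta[0]


def retain_loss_grad(theta):
    """Gradient of the toy retain loss 0.5 * ||theta - RETAIN_TARGET||^2."""
    return theta - RETAIN_TARGET


def erase_loss(theta):
    """Toy erasure objective 0.5 * (theta[0] - ERASE_TARGET)^2."""
    return 0.5 * (theta[0] - ERASE_TARGET) ** 2


def erase_loss_grad(theta):
    """Gradient of the toy erasure objective."""
    return np.array([theta[0] - ERASE_TARGET, 0.0])


def constrained_unlearn(theta0, steps=400, lr=0.1, eta=1.0,
                        inner_steps=5, inner_lr=0.5):
    """First-order update for: minimize retain loss s.t. g(theta) <= 0,
    where g(theta) = erase_loss(theta) - min_phi erase_loss(phi)."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(steps):
        # Approximate the inner minimum with a few gradient steps from theta.
        phi = theta.copy()
        for _ in range(inner_steps):
            phi = phi - inner_lr * erase_loss_grad(phi)

        # Value-function constraint estimate; phi is treated as a constant,
        # so the gradient of g_hat w.r.t. theta is just the erasure gradient.
        g_hat = erase_loss(theta) - erase_loss(phi)
        grad_retain = retain_loss_grad(theta)
        grad_g = erase_loss_grad(theta)

        # Assumed multiplier rule: pick the smallest lam >= 0 such that the
        # update direction reduces g_hat at a rate of at least eta * g_hat.
        sq_norm = grad_g @ grad_g
        if sq_norm > 0.0:
            lam = max(0.0, (eta * g_hat - grad_retain @ grad_g) / sq_norm)
        else:
            lam = 0.0

        theta = theta - lr * (grad_retain + lam * grad_g)
    return theta


if __name__ == "__main__":
    print(constrained_unlearn([0.0, 0.0]))
```

On this toy problem the iterate should approach the point where `theta[0]` satisfies the erasure constraint and `theta[1]` stays at the retain-loss optimum, which illustrates the trade-off the abstract describes: the retain gradient drives the update, and the multiplier adds only as much of the erasure gradient as is needed to keep the constraint shrinking.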