All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models

Text-to-image models such as Stable Diffusion achieve impressive image synthesis by training on large-scale datasets. However, these datasets may contain sexually explicit, copyrighted, or otherwise undesirable content, which the model can then generate directly. Because retraining these large models for each concept deletion request is infeasible, fine-tuning algorithms have been developed to erase concepts from diffusion models. While these algorithms erase concepts well, each exhibits one of the following issues: 1) a corrupted feature space leads to the synthesis of disintegrated objects, 2) the content the model originally synthesized diverges in both spatial structure and semantics in the generated images, and 3) sub-optimal training updates make the model more susceptible to utility harm. These issues severely degrade the original utility of generative models. In this work, we present a new approach that addresses all of these challenges. We take inspiration from classifier guidance and propose a surgical update on the classifier guidance term while constraining the drift of the unconditional score term. Furthermore, our algorithm lets the user select an alternative concept to replace the erased one, which gives more control. Our experimental results show that our algorithm not only erases the target concept effectively but also preserves the model's generation capability.
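The abstract does not give the training objective, so the following is only a minimal sketch of one way to read "a surgical update on the guidance term while constraining the drift of the unconditional score term". It assumes the standard classifier-free guidance decomposition (unconditional prediction plus a scaled guidance term). The function names `cfg_noise` and `surgical_loss`, the weight `lam`, and the squared-error form are all hypothetical choices for illustration, not the authors' formulation. Noise predictions are represented as plain lists of floats to keep the example dependency-free.

```python
def cfg_noise(eps_uncond, eps_cond, w):
    """Classifier-free guidance: unconditional prediction plus w times the guidance term."""
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def surgical_loss(eps_u_new, eps_c_new, eps_u_old, eps_alt_old, lam=1.0):
    """Hypothetical objective (an assumption, not the paper's loss).

    eps_u_new   : fine-tuned model, unconditional prediction
    eps_c_new   : fine-tuned model, prediction for the concept being erased
    eps_u_old   : frozen original model, unconditional prediction
    eps_alt_old : frozen original model, prediction for the user-chosen alternative
    """
    # Guidance term of the fine-tuned model for the erased concept.
    guidance_new = [c - u for c, u in zip(eps_c_new, eps_u_new)]
    # Guidance term of the frozen model for the alternative concept.
    guidance_target = [a - u for a, u in zip(eps_alt_old, eps_u_old)]
    # Pull the erased concept's guidance toward the alternative's guidance.
    erase = mse(guidance_new, guidance_target)
    # Penalize any drift of the unconditional prediction.
    drift = mse(eps_u_new, eps_u_old)
    return erase + lam * drift


if __name__ == "__main__":
    eps_u_old = [0.0, 0.0]
    eps_alt_old = [1.0, 2.0]
    # A fine-tuned model whose unconditional output has drifted by 1 everywhere.
    print(surgical_loss([1.0, 1.0], [2.0, 3.0], eps_u_old, eps_alt_old, lam=2.0))
```

In this sketch the first term touches only the difference between conditional and unconditional predictions, while the second term keeps the unconditional prediction close to the frozen model, which is the part of the sampler shared by all prompts.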

