Concept erasure in text-to-image diffusion models aims to prevent pre-trained diffusion models from generating images related to a target concept. Reliable concept erasure requires two properties: robustness and locality. Robustness means the model does not produce images associated with the target concept for any paraphrased or learned prompt, while locality means the model retains its ability to generate images of non-target concepts. In this paper, we propose Reliable Concept Erasing via Lightweight Erasers (Receler), which learns a lightweight Eraser to perform concept erasing and improves locality and robustness through the proposed concept-localized regularization and adversarial prompt learning, respectively. Comprehensive quantitative and qualitative experiments with various concept prompts show that Receler outperforms previous erasing methods on both properties.