Text-to-image (T2I) diffusion models often inadvertently generate unwanted concepts such as watermarks and unsafe images. These concepts, which we term "implicit concepts", can be unintentionally learned during training and then generated uncontrollably during inference. Existing removal methods still struggle to eliminate implicit concepts, primarily because they depend on the model's ability to recognize concepts that it cannot actually discern. To address this, we exploit the intrinsic geometric characteristics of implicit concepts and present Geom-Erasing, a novel concept removal method based on geometry-driven control. Specifically, once an unwanted implicit concept is identified, we integrate the existence and geometric information of the concept into the text prompts with the help of an accessible classifier or detector model. The model is then optimized to identify and disentangle this information, which is subsequently used as a negative prompt during generation. Moreover, we introduce the Implicit Concept Dataset (ICD), a novel image-text dataset containing three typical implicit concepts (i.e., QR codes, watermarks, and text), reflecting real-life situations where implicit concepts are easily injected. Geom-Erasing effectively mitigates the generation of implicit concepts, achieving state-of-the-art results on the Inappropriate Image Prompts (I2P) benchmark and on our challenging Implicit Concept Dataset (ICD) benchmark.
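The prompt-augmentation step described in the abstract can be illustrated with a minimal sketch. It assumes that a detector returns pixel-space bounding boxes, that the image is divided into a uniform grid, and that each occupied grid cell maps to a location token of the form `<loc_k>`. The token format, the grid size, the `Box` type, and all function names are hypothetical and are not taken from the abstract. The way the negative prompt is built here (the concept word followed by every location token) is likewise only one plausible reading, not a confirmed detail of Geom-Erasing.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical detector output: a bounding box in pixel coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def occupied_cells(box: Box, width: int, height: int, grid: int = 4) -> List[int]:
    """Return row-major indices of the grid cells that the box overlaps."""
    cell_w, cell_h = width / grid, height / grid

    def clamp(i: int) -> int:
        return max(0, min(grid - 1, i))

    col_lo = clamp(int(box.x_min // cell_w))
    row_lo = clamp(int(box.y_min // cell_h))
    # ceil(...) - 1 keeps a box that ends exactly on a cell border
    # from spilling into the next cell.
    col_hi = clamp(math.ceil(box.x_max / cell_w) - 1)
    row_hi = clamp(math.ceil(box.y_max / cell_h) - 1)
    return [r * grid + c
            for r in range(row_lo, row_hi + 1)
            for c in range(col_lo, col_hi + 1)]


def augment_caption(caption: str, concept: str, boxes: List[Box],
                    width: int, height: int, grid: int = 4) -> str:
    """Append the concept word and its location tokens to a training caption.

    Captions of images where the detector found nothing are left unchanged.
    """
    cells = sorted({i for b in boxes
                    for i in occupied_cells(b, width, height, grid)})
    if not cells:
        return caption
    tokens = " ".join(f"<loc_{i}>" for i in cells)
    return f"{caption}, {concept} {tokens}"


def negative_prompt(concept: str, grid: int = 4) -> str:
    """Assumed inference-time negative prompt: concept plus all location tokens."""
    tokens = " ".join(f"<loc_{i}>" for i in range(grid * grid))
    return f"{concept} {tokens}"


if __name__ == "__main__":
    # A watermark detected in the bottom-right corner of a 512x512 image.
    corner = Box(400, 450, 500, 500)
    print(augment_caption("a photo of a dog", "watermark", [corner], 512, 512))
    # prints: a photo of a dog, watermark <loc_15>
    print(negative_prompt("watermark", grid=2))
```

In this sketch the augmented captions would be used for fine-tuning, and the string returned by `negative_prompt` would be passed to the sampler's negative-prompt input at generation time. Encoding the location as discrete tokens keeps the geometric signal inside the text channel, which is what the abstract describes, without requiring any change to the image branch of the model.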