Text-conditioned image generation models have recently achieved astonishing results in image quality and text alignment and are consequently employed in a fast-growing number of applications. Since they are highly data-driven, relying on billion-sized datasets randomly scraped from the internet, they also suffer, as we demonstrate, from degenerated and biased human behavior. In turn, they may even reinforce such biases. To help combat these undesired side effects, we present safe latent diffusion (SLD). Specifically, to measure the inappropriate degeneration due to unfiltered and imbalanced training sets, we establish a novel image generation test bed, inappropriate image prompts (I2P), containing dedicated, real-world image-to-text prompts covering concepts such as nudity and violence. As our exhaustive empirical evaluation demonstrates, the introduced SLD removes and suppresses inappropriate image parts during the diffusion process, with no additional training required and no adverse effect on overall image quality or text alignment.