As one of the most successful families of generative models, diffusion models have demonstrated remarkable efficacy in synthesizing high-quality images. These models learn the underlying high-dimensional data distribution in an unsupervised manner. Despite their success, diffusion models are highly data-driven and prone to inheriting the imbalances and biases present in real-world data. Some studies have attempted to address these issues by designing text prompts for known biases or by using bias labels to construct unbiased data. While these methods have shown improved results, real-world scenarios often contain various unknown biases, and obtaining bias labels is particularly challenging. In this paper, we emphasize the necessity of mitigating bias in pre-trained diffusion models without relying on auxiliary bias annotations. To tackle this problem, we propose a framework, InvDiff, which aims to learn invariant semantic information for diffusion guidance. Specifically, we propose identifying underlying biases in the training data and designing a novel debiasing training objective. We then employ a lightweight trainable module that automatically preserves invariant semantic information and simultaneously uses it to guide the diffusion model's sampling process toward unbiased outcomes. Notably, we only need to learn a small number of parameters in this lightweight module, without altering the pre-trained diffusion model. Furthermore, we provide a theoretical guarantee that the implementation of InvDiff is equivalent to reducing an upper bound on the generalization error. Extensive experimental results on three publicly available benchmarks demonstrate that InvDiff effectively reduces biases while maintaining the quality of image generation. Our code is available at https://github.com/Hundredl/InvDiff.