Diffusion Models (DMs) have emerged as powerful generative models with unprecedented image generation capability. These models are widely used for data augmentation and creative applications. However, DMs reflect the biases present in their training datasets. This is especially concerning in the context of faces, where a DM favors one demographic subgroup over others (e.g., female vs. male). In this work, we present a method for debiasing DMs without relying on additional data or model retraining. Specifically, we propose Distribution Guidance, which constrains the generated images to follow a prescribed attribute distribution. To realize this, we build on the key insight that the latent features of the denoising UNet hold rich demographic semantics, which can be leveraged to guide debiased generation. We train an Attribute Distribution Predictor (ADP), a small MLP that maps the latent features to the distribution of attributes. The ADP is trained with pseudo labels generated by existing attribute classifiers. The proposed Distribution Guidance with ADP enables fair generation. Our method reduces bias across single and multiple attributes and outperforms the baseline by a significant margin for both unconditional and text-conditional diffusion models. Further, we present a downstream task of training a fair attribute classifier by rebalancing the training set with our generated data.
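To make the idea of guiding a batch toward a prescribed attribute distribution concrete, the following is a minimal sketch, not the authors' implementation. It assumes a two-class attribute and a squared-error loss between the batch-averaged prediction and the target. For simplicity it applies gradient steps directly to per-sample logits, which stand in for ADP outputs; in the method described above, the gradient would instead flow through the ADP into the UNet latent features. The names `batch_distribution`, `distribution_loss`, and `guidance_step` are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one sample's attribute logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def batch_distribution(batch_logits):
    # Average the per-sample class probabilities over the batch.
    probs = [softmax(row) for row in batch_logits]
    n, k = len(probs), len(probs[0])
    return [sum(p[j] for p in probs) / n for j in range(k)]

def distribution_loss(batch_logits, target):
    # Assumed loss: squared error between batch distribution and target.
    q = batch_distribution(batch_logits)
    return sum((qj - tj) ** 2 for qj, tj in zip(q, target))

def guidance_step(batch_logits, target, step_size=1.0):
    # One gradient-descent step on the logits (stand-in for latent features).
    # Analytic gradient: dL/dz_ik = (2/n) * p_ik * (diff_k - sum_j diff_j * p_ij).
    n = len(batch_logits)
    q = batch_distribution(batch_logits)
    diff = [qj - tj for qj, tj in zip(q, target)]
    updated = []
    for row in batch_logits:
        p = softmax(row)
        inner = sum(d * pj for d, pj in zip(diff, p))
        grad = [(2.0 / n) * pk * (dk - inner) for pk, dk in zip(p, diff)]
        updated.append([z - step_size * g for z, g in zip(row, grad)])
    return updated

# Toy batch skewed toward class 0 (three of four samples), with a balanced target.
skewed = [[2.0, 0.0], [2.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
target = [0.5, 0.5]

guided = skewed
for _ in range(200):
    guided = guidance_step(guided, target)

print("loss before:", distribution_loss(skewed, target))
print("loss after: ", distribution_loss(guided, target))
```

Note that the loss is defined on the batch average rather than on each sample, so individual samples remain free to belong to either class as long as the batch as a whole matches the target proportions.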