Balancing Act: Distribution-Guided Debiasing in Diffusion Models

Diffusion Models (DMs) have emerged as powerful generative models with unprecedented image generation capability. These models are widely used for data augmentation and creative applications. However, DMs reflect the biases present in their training datasets. This is especially concerning in the context of faces, where a DM favors one demographic subgroup over others (e.g., female over male). In this work, we present a method for debiasing DMs without relying on additional data or model retraining. Specifically, we propose Distribution Guidance, which constrains the generated images to follow a prescribed attribute distribution. To realize this, we build on the key insight that the latent features of the denoising UNet hold rich demographic semantics, which can be leveraged to guide debiased generation. We train an Attribute Distribution Predictor (ADP), a small MLP that maps the latent features to the distribution of attributes. The ADP is trained with pseudo labels generated by existing attribute classifiers. The proposed Distribution Guidance with the ADP enables fair generation. Our method reduces bias across single and multiple attributes and outperforms the baseline by a significant margin for both unconditional and text-conditional diffusion models. Further, we present a downstream task: training a fair attribute classifier by rebalancing its training set with our generated data.
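To make the idea of guiding a batch toward a prescribed attribute distribution concrete, here is a minimal toy sketch. It is not the paper's implementation. It assumes a single binary attribute, replaces the ADP with a fixed logistic predictor on 2-D feature vectors, and applies gradient steps directly to those features rather than to noisy latents inside a diffusion sampler. The names `predict` and `guidance_step`, the weights, the step size, and the squared-error loss are all hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, w, b):
    """Toy stand-in for the ADP (assumption): per-sample probability of class 1."""
    return [sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b) for h in features]


def guidance_step(features, w, b, target, step_size):
    """One gradient step on L = (mean_i p_i - target)^2 with respect to the features.

    The loss is defined on the batch-level predicted distribution, not on
    individual samples, so single samples stay free to take either class.
    """
    probs = predict(features, w, b)
    n = len(features)
    gap = sum(probs) / n - target
    updated = []
    for h, p in zip(features, probs):
        # dL/dh = 2 * gap * (1/n) * p * (1 - p) * w  (chain rule through the sigmoid)
        coeff = 2.0 * gap * p * (1.0 - p) / n
        updated.append([hi - step_size * coeff * wi for hi, wi in zip(h, w)])
    return updated


# Hypothetical setup: features skewed so the predictor favors class 1.
rng = random.Random(0)
w, b = [1.0, -0.5], 0.5
features = [[rng.gauss(1.0, 1.0), rng.gauss(1.0, 1.0)] for _ in range(32)]
target = 0.5  # prescribed fraction of class 1 (a balanced batch)

initial_mean = sum(predict(features, w, b)) / len(features)
guided = features
for _ in range(300):
    guided = guidance_step(guided, w, b, target, step_size=20.0)
final_mean = sum(predict(guided, w, b)) / len(guided)

print(f"predicted class-1 fraction before guidance: {initial_mean:.3f}")
print(f"predicted class-1 fraction after guidance:  {final_mean:.3f}")
```

In this sketch the guidance signal depends only on the gap between the batch-mean prediction and the target, which is one simple way to express "follow the prescribed attribute distribution"; the actual loss and update rule used by the authors may differ.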

