Balancing Act: Distribution-Guided Debiasing in Diffusion Models

Diffusion Models (DMs) have emerged as powerful generative models with unprecedented image generation capability. These models are widely used for data augmentation and creative applications. However, DMs reflect the biases present in their training datasets. This is especially concerning for faces, where a DM may favor one demographic subgroup over others (e.g., female vs. male). In this work, we present a method for debiasing DMs without relying on additional data or model retraining. Specifically, we propose Distribution Guidance, which steers the generated images to follow a prescribed attribute distribution. To realize this, we build on the key insight that the latent features of the denoising UNet hold rich demographic semantics, which can be leveraged to guide debiased generation. We train an Attribute Distribution Predictor (ADP), a small MLP that maps the latent features to the distribution of attributes. ADP is trained with pseudo labels generated from existing attribute classifiers. The proposed Distribution Guidance with ADP enables fair generation. Our method reduces bias across single and multiple attributes and outperforms the baseline by a significant margin for both unconditional and text-conditional diffusion models. Further, we present a downstream task of training a fair attribute classifier by rebalancing the training set with our generated data.
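To make the idea of a prescribed attribute distribution concrete, the following is a minimal sketch of a batch-level objective, not the authors' implementation. It assumes a predictor that outputs per-sample attribute logits, averages the resulting probabilities over the batch, and compares that average to a target distribution. The squared L2 distance, the function names, and the example logits are all assumptions chosen for illustration; the abstract does not specify the loss or the guidance update.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def batch_attribute_distribution(batch_logits):
    """Average per-sample class probabilities into one batch-level distribution."""
    probs = [softmax(row) for row in batch_logits]
    n = len(probs)
    k = len(probs[0])
    return [sum(p[j] for p in probs) / n for j in range(k)]


def distribution_loss(batch_logits, target):
    """Squared L2 distance between batch and target distributions (assumed loss)."""
    estimate = batch_attribute_distribution(batch_logits)
    return sum((e - t) ** 2 for e, t in zip(estimate, target))


# Hypothetical predictor outputs for a two-class attribute, batch of four.
skewed = [[2.0, 0.0], [2.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
balanced = [[2.0, 0.0], [0.0, 2.0], [2.0, 0.0], [0.0, 2.0]]
target = [0.5, 0.5]  # prescribed distribution: equal share per class

print(distribution_loss(skewed, target))
print(distribution_loss(balanced, target))
```

The loss is defined on the whole batch rather than on any single sample, so individual images remain free to belong to either class as long as the batch matches the target. In a guidance setting, the gradient of such a loss with respect to the latents would presumably be used to adjust each denoising step, but that step is not shown here.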
