Image generative models, particularly diffusion-based models, have surged in popularity due to their remarkable ability to synthesize highly realistic images. However, because these models are data-driven, they inherit biases from their training datasets, frequently leading to disproportionate group representations that exacerbate societal inequities. Efforts to debias these models have traditionally relied on predefined sensitive attributes, classifiers trained on such attributes, or large language models to steer outputs toward fairness. These approaches share a notable drawback: predefined attributes cannot adequately capture the complex, continuous variations among groups. To address this limitation, we introduce the Debiasing Diffusion Model (DDM), which leverages an indicator to learn latent representations during training, promoting fairness through balanced representations without requiring predefined sensitive attributes. DDM not only proves effective in scenarios previously handled by conventional techniques but also enhances fairness without relying on predefined sensitive attributes as conditions. In this paper, we discuss the limitations of prior bias mitigation techniques for diffusion-based models, elaborate on the architecture of the DDM, and validate the effectiveness of our approach through experiments.
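Since the abstract describes the indicator mechanism only at a high level, the following is a minimal PyTorch sketch of one way such a training objective could look: a standard denoising loss combined with a penalty that pushes the indicator's discovered groups toward balanced representation. The `Indicator` module, the `encoder` and `denoiser` callables, the number of groups, and the KL-based balance term are all illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of a DDM-style training step. All component names and
# the specific balance loss are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Indicator(nn.Module):
    """Maps a latent summary to soft assignments over K discovered groups."""
    def __init__(self, latent_dim: int, num_groups: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, num_groups),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return F.softmax(self.net(h), dim=-1)  # (B, K) soft group weights

def training_step(denoiser, encoder, indicator, x0, alphas_cumprod, lam=0.1):
    """One step: denoising MSE plus a group-balance penalty (assumed form)."""
    b = x0.size(0)
    t = torch.randint(0, alphas_cumprod.size(0), (b,), device=x0.device)
    noise = torch.randn_like(x0)
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    xt = a.sqrt() * x0 + (1.0 - a).sqrt() * noise      # forward process q(x_t | x_0)

    denoise_loss = F.mse_loss(denoiser(xt, t), noise)  # usual epsilon-prediction loss

    h = encoder(xt, t)                                 # latent summary, shape (B, D)
    p = indicator(h)                                   # soft group assignments
    p_mean = p.mean(dim=0)                             # batch-level group usage
    uniform = torch.full_like(p_mean, 1.0 / p_mean.numel())
    # KL(p_mean || uniform): pushes discovered groups toward equal representation
    balance_loss = torch.sum(p_mean * ((p_mean + 1e-8).log() - uniform.log()))

    return denoise_loss + lam * balance_loss
```

In this reading, the indicator never needs attribute labels: it discovers latent groups on its own, and the balance term only constrains how often each group is used, which matches the abstract's claim of fairness without predefined sensitive attributes.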