Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications. However, since they are highly data-driven, relying on billion-sized datasets randomly scraped from the internet, they also suffer from degenerated and biased human behavior, as we demonstrate. In fact, they may even reinforce such biases. To not only uncover but also combat these undesired effects, we present a novel strategy, called Fair Diffusion, to attenuate biases after the deployment of generative text-to-image models. Specifically, we demonstrate shifting a bias, based on human instructions, in any direction, yielding arbitrary new proportions for, e.g., identity groups. As our empirical evaluation demonstrates, this introduced control enables instructing generative image models on fairness, with no data filtering or additional training required.