Generative AI models have recently achieved astonishing results in quality and are consequently employed in a fast-growing number of applications. However, because they are highly data-driven and rely on billion-sized datasets randomly scraped from the internet, they also reproduce degenerate and biased human behavior, as we demonstrate. In fact, they may even reinforce such biases. To not only uncover but also combat these undesired effects, we present a novel strategy, called Fair Diffusion, to attenuate biases after the deployment of generative text-to-image models. Specifically, we demonstrate that a bias can be shifted in any direction based on human instructions, yielding arbitrary new proportions for, e.g., identity groups. As our empirical evaluation demonstrates, this control enables instructing generative image models on fairness, with no data filtering or additional training required.