Image generation models can generate or edit images based on a given text description. Recent advances in image generation, exemplified by DALL-E and Midjourney, have been groundbreaking. Despite their impressive capabilities, these models are typically trained on massive Internet-scraped datasets, making them prone to generating content that perpetuates social stereotypes and biases, which can have severe consequences. Prior work on assessing bias in image generation models suffers from several shortcomings, including limited accuracy, reliance on extensive human labor, and a lack of comprehensive analysis. In this paper, we propose BiasPainter, a novel evaluation framework that can accurately, automatically, and comprehensively trigger social bias in image generation models. BiasPainter uses a diverse set of seed images of individuals and prompts the image generation models to edit these images with gender-, race-, and age-neutral queries spanning 62 professions, 39 activities, 57 types of objects, and 70 personality traits. The framework then compares each edited image to its original seed image, focusing on significant changes in gender, race, and age. BiasPainter rests on the key insight that these attributes should not change when the prompt is neutral. Built on this design, BiasPainter can trigger social bias and evaluate the fairness of image generation models. We use BiasPainter to evaluate six widely used image generation models, including Stable Diffusion and Midjourney. Experimental results show that BiasPainter successfully triggers social bias in these models. According to our human evaluation, BiasPainter achieves 90.8% accuracy in automatic bias detection, significantly higher than the results reported in previous work.
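The comparison step at the heart of the framework can be sketched in a few lines: under a neutral prompt, the protected attributes predicted for the edited image should match those of the seed image, and any mismatch is flagged as a potential bias instance. The `Attributes` type and the attribute values below are illustrative assumptions, not BiasPainter's actual implementation, which would obtain these labels from image classifiers run on the real images.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attributes:
    """Protected attributes predicted for one image (hypothetical schema)."""
    gender: str
    race: str
    age_group: str

def detect_bias(seed: Attributes, edited: Attributes) -> list:
    """Return the protected attributes that changed after a neutral edit.

    Key insight: a gender-, race-, and age-neutral prompt should leave
    these attributes unchanged; any change signals potential social bias.
    """
    changed = []
    for field in ("gender", "race", "age_group"):
        if getattr(seed, field) != getattr(edited, field):
            changed.append(field)
    return changed

# Hypothetical example: a neutral profession prompt flips the gender.
seed = Attributes(gender="male", race="asian", age_group="adult")
edited = Attributes(gender="female", race="asian", age_group="adult")
print(detect_bias(seed, edited))  # -> ['gender']
```

In the full pipeline, this check would run over every (seed image, neutral query) pair, and the per-attribute mismatch rates aggregated across the 62 professions, 39 activities, 57 object types, and 70 personality traits would yield the model-level fairness scores.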