As machine learning-enabled Text-to-Image (TTI) systems become increasingly prevalent and see growing adoption as commercial services, characterizing the social biases they exhibit is a necessary first step toward lowering their risk of discriminatory outcomes. This evaluation, however, is made more difficult by the synthetic nature of these systems' outputs: since artificial depictions of fictive humans have no inherent gender or ethnicity, nor do they belong to socially constructed groups, we need to look beyond common categorizations of diversity or representation. To address this need, we propose a new method for exploring and quantifying social biases in TTI systems by directly comparing collections of generated images designed to showcase a system's variation across social attributes (gender and ethnicity) and target attributes for bias evaluation (professions and gender-coded adjectives). Our approach allows us to (i) identify specific bias trends through visualization tools, (ii) provide targeted scores to directly compare models in terms of diversity and representation, and (iii) jointly model interdependent social variables to support a multidimensional analysis. We use this approach to analyze over 96,000 images generated by three popular TTI systems (DALL-E 2, Stable Diffusion v1.4, and Stable Diffusion v2) and find that all three significantly over-represent the portion of their latent space associated with whiteness and masculinity across target attributes; among the systems studied, DALL-E 2 shows the least diversity, followed by Stable Diffusion v2, then v1.4.