Computer vision models are known to encode harmful biases, which can lead to the unfair treatment of historically marginalized groups, such as people of color. However, there remains a lack of datasets balanced along demographic traits that can be used to evaluate the downstream fairness of these models. In this work, we demonstrate that diffusion models can be leveraged to create such a dataset. We first use a diffusion model to generate a large set of images depicting various occupations. We then edit each image using inpainting to generate multiple variants, where each variant depicts a different perceived race. Using this dataset, we benchmark several vision-language models on a multi-class occupation classification task. We find that images generated with non-Caucasian labels have a significantly higher occupation misclassification rate than images generated with Caucasian labels, and that several misclassifications are suggestive of racial biases. We measure a model's downstream fairness by computing the standard deviation in the probability of predicting the true occupation label across the different perceived identity groups. Using this fairness metric, we find significant disparities between the evaluated vision-language models. We hope that our work demonstrates the potential value of diffusion methods for fairness evaluations.
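The fairness metric described above can be illustrated with a minimal sketch. The abstract does not specify the exact computation, so the following rests on assumptions: the function name `fairness_std` and the `(group, probability)` record layout are hypothetical, probabilities are averaged within each perceived identity group before the spread is measured, and the population standard deviation is used (the sample standard deviation would be an equally plausible reading).

```python
from collections import defaultdict
from statistics import mean, pstdev


def fairness_std(records):
    """Spread of true-label probability across perceived identity groups.

    records: iterable of (group, prob_true_label) pairs, where
    prob_true_label is the probability a model assigned to the correct
    occupation for one image. Both the layout and the name are
    illustrative assumptions, not taken from the paper.
    """
    by_group = defaultdict(list)
    for group, prob in records:
        by_group[group].append(prob)

    # Assumption: average within each group first, then measure the
    # spread of those group-level means.
    group_means = {g: mean(p) for g, p in by_group.items()}

    # Assumption: population standard deviation over the group means.
    return pstdev(list(group_means.values())), group_means


# Toy, made-up probabilities purely for illustration.
example = [("A", 0.8), ("A", 0.6), ("B", 0.5), ("B", 0.5)]
score, per_group = fairness_std(example)
print(score, per_group)
```

Under this reading, a score of zero means the model is equally likely to predict the true occupation for every group, and larger values indicate greater disparity between groups.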