AI systems have been shown to produce unfair results for certain subgroups of the population, highlighting the need to understand bias with respect to sensitive attributes. Current research often falls short, focusing primarily on subgroups characterized by a single sensitive attribute while neglecting the intersectional nature of fairness across multiple sensitive attributes. This paper addresses one fundamental aspect of this problem: discovering diverse high-bias subgroups under intersectional sensitive attributes. Specifically, we propose a Bias-Guided Generative Network (BGGN). By treating each bias value as a reward, BGGN efficiently generates high-bias combinations of intersectional sensitive attributes. Experiments on real-world text and image datasets demonstrate that BGGN discovers such subgroups both diversely and efficiently. To further evaluate the generated intersectional sensitive attributes, which are unseen but potentially unfair, we formulate them as prompts and use modern generative AI to produce new texts and images. The frequency with which biased data is generated offers new insight into potential unfairness in popular modern generative AI systems. Warning: This paper contains generated examples that are offensive in nature.
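To make the bias-as-reward idea concrete, the following is a minimal, hypothetical sketch, not the actual BGGN architecture: it treats each intersectional subgroup (here, invented gender-by-age combinations with made-up bias scores) as a candidate, and performs gradient ascent on the expected bias-reward of a softmax sampling distribution, so that probability mass concentrates on high-bias subgroups.

```python
import math

# Hypothetical bias scores for four intersectional subgroups; in the paper,
# these would be bias values measured for each subgroup on real data.
BIAS = {
    ("female", "young"): 0.9,
    ("female", "old"): 0.4,
    ("male", "young"): 0.2,
    ("male", "old"): 0.6,
}
GROUPS = list(BIAS)
REWARDS = [BIAS[g] for g in GROUPS]

def softmax(logits):
    mx = max(logits)
    exps = [math.exp(l - mx) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def train(steps=500, lr=1.0):
    # Gradient ascent on the expected reward E[r] of the sampling
    # distribution: d E[r] / d logit_j = p_j * (r_j - E[r]).
    logits = [0.0] * len(GROUPS)
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, REWARDS))
        for j in range(len(logits)):
            logits[j] += lr * probs[j] * (REWARDS[j] - expected)
    return softmax(logits)

probs = train()
# Probability mass concentrates on the highest-bias subgroup.
top = GROUPS[probs.index(max(probs))]
```

In this toy setting the distribution collapses onto the single highest-bias subgroup; the actual BGGN instead generates diverse high-bias attribute combinations, which a plain reward-maximizing objective like this one would not provide.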