In the last few years, Artificial Intelligence systems have become increasingly widespread. Unfortunately, these systems can share many biases with human decision-making, including demographic biases. These biases can often be traced back to the training data, where large uncurated datasets have become the norm. Although these biases are well known, we still lack general tools to detect and quantify them, and to compare the biases present in different datasets. In this work, we propose DSAP (Demographic Similarity from Auxiliary Profiles), a two-step methodology for comparing the demographic composition of two datasets. DSAP can be deployed in three key applications: detecting and characterizing demographic blind spots and bias issues across datasets, measuring demographic bias in a single dataset, and measuring demographic shift in deployment scenarios. An essential feature of DSAP is its ability to robustly analyze datasets without explicit demographic labels, offering simplicity and interpretability for a wide range of situations. To show the usefulness of the proposed methodology, we consider the Facial Expression Recognition task, where demographic bias has previously been found. We study the three applications over a set of twenty datasets with varying properties. The code is available at https://github.com/irisdominguez/DSAP.