Current facial recognition (FR) models exhibit demographic bias. To measure this bias across ethnic and gender subgroups, we introduce the Balanced Faces in the Wild (BFW) dataset, which allows FR performance to be characterized per subgroup. We found that relying on a single score threshold to separate genuine from impostor sample pairs leads to suboptimal results. Moreover, performance within subgroups often varies significantly from the global average, so reported error rates hold only for populations that match the validation data. To mitigate imbalanced performance, we propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks. The scheme boosts average performance and preserves identity information while removing demographic knowledge, which both prevents potential biases from affecting decision-making and protects privacy. We explore the proposed method and demonstrate that subgroup classifiers can no longer learn from features projected using our domain adaptation scheme. The source code and data are available at https://github.com/visionjo/facerec-bias-bfw.
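The claim that a single score threshold is suboptimal across subgroups can be illustrated with a minimal sketch. It is not the BFW evaluation code: the scores are synthetic, and the subgroup names, score distributions, target rate, and helper functions (`threshold_at_fmr`, `false_match_rate`) are all assumptions made for illustration.

```python
import numpy as np


def threshold_at_fmr(impostor_scores, target_fmr):
    """Score threshold at the (1 - target_fmr) quantile of the impostor scores.

    Pairs scoring above the threshold are accepted as matches, so roughly
    target_fmr of the impostor pairs used here are falsely accepted.
    """
    return float(np.quantile(np.asarray(impostor_scores), 1.0 - target_fmr))


def false_match_rate(impostor_scores, threshold):
    """Fraction of impostor pairs whose score exceeds the threshold."""
    return float(np.mean(np.asarray(impostor_scores) > threshold))


rng = np.random.default_rng(0)

# Synthetic impostor similarity scores for two hypothetical subgroups whose
# score distributions are shifted relative to each other (an assumption).
impostors = {
    "subgroup_a": rng.normal(0.10, 0.10, 20000),
    "subgroup_b": rng.normal(0.25, 0.10, 20000),
}
target_fmr = 0.01

# One global threshold, chosen on the pooled impostor scores.
pooled = np.concatenate(list(impostors.values()))
global_thr = threshold_at_fmr(pooled, target_fmr)

# One threshold per subgroup, chosen on that subgroup's scores only.
per_group_thr = {g: threshold_at_fmr(s, target_fmr) for g, s in impostors.items()}

# False match rate each subgroup experiences under each policy.
global_fmr = {g: false_match_rate(s, global_thr) for g, s in impostors.items()}
per_group_fmr = {
    g: false_match_rate(s, per_group_thr[g]) for g, s in impostors.items()
}

for g in impostors:
    print(g, "global:", global_fmr[g], "per-subgroup:", per_group_fmr[g])
```

On this synthetic data the global threshold meets the target rate only on the pooled set: one subgroup sits well above it and the other well below, while subgroup-specific thresholds bring each subgroup close to the target.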