Face recognition and verification are two computer vision tasks whose performance has improved with the introduction of deep representations. However, ethical, legal, and technical challenges, stemming from the sensitive nature of face data and from biases in real training datasets, hinder their development. Generative AI addresses privacy by creating fictitious identities, but fairness problems persist. We promote fairness by introducing a mechanism that balances demographic attributes in generated training datasets. We experiment with an existing real dataset, three generated training datasets, and balanced versions of a diffusion-based dataset. We propose a comprehensive evaluation that gives equal weight to accuracy and fairness and includes a rigorous regression-based statistical analysis of attributes. The analysis shows that balancing reduces demographic unfairness. In addition, a performance gap persists even though generation has become more accurate over time. The proposed balancing method and comprehensive verification evaluation promote fairer and more transparent face recognition and verification.