Synthetic data is becoming increasingly relevant for training machine learning models. This is motivated by several factors, including the scarcity of real data and its limited intra-class variability, the time and errors involved in manual labeling, and, in some cases, privacy concerns. This paper presents an overview of the 2nd edition of the Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), organized at CVPR 2024. FRCSyn aims to investigate the use of synthetic data in face recognition to address current technological limitations, including data privacy concerns, demographic biases, generalization to novel scenarios, and performance constraints in challenging situations such as aging, pose variations, and occlusions. Unlike the 1st edition, in which only synthetic data from the DCFace and GANDiffFace methods was allowed for training face recognition systems, in this 2nd edition we propose new sub-tasks that allow participants to explore novel face generative methods. The outcomes of the 2nd FRCSyn Challenge, along with the proposed experimental protocol and benchmarking, contribute significantly to the application of synthetic data to face recognition.