Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data

Ivan DeAndres-Tame, Ruben Tolosana, Pietro Melzi, Ruben Vera-Rodriguez, Minchul Kim, Christian Rathgeb, Xiaoming Liu, Luis F. Gomez, Aythami Morales, Julian Fierrez, Javier Ortega-Garcia, Zhizhou Zhong, Yuge Huang, Yuxi Mi, Shouhong Ding, Shuigeng Zhou, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Zhihong Xiao, Evgeny Smirnov, Anton Pimenov, Aleksei Grigorev, Denis Timoshenko, Kaleb Mesfin Asfaw, Cheng Yaw Low, Hao Liu, Chuyi Wang, Qing Zuo, Zhixiang He, Hatef Otroshi Shahreza, Anjith George, Alexander Unnervik, Parsa Rahimi, Sébastien Marcel, Pedro C. Neto, Marco Huber, Jan Niklas Kolf, Naser Damer, Fadi Boutros, Jaime S. Cardoso, Ana F. Sequeira, Andrea Atzori, Gianni Fenu, Mirko Marras, Vitomir Štruc, Jiang Yu, Zhangjie Li, Jichun Li, Weisong Zhao, Zhen Lei, Xiangyu Zhu, Xiao-Yu Zhang, Bernardo Biesseck, Pedro Vidal, Luiz Coelho, Roger Granada, David Menotti

Synthetic data is becoming increasingly popular for face recognition, mainly because of privacy concerns and the difficulty of obtaining real data that covers, among other factors, diverse scenarios, quality levels, and demographic groups. It also offers advantages over real data, such as the large amount of data that can be generated and the ability to customize it for specific problems. To use such data effectively, face recognition models should also be designed specifically to exploit synthetic data to its fullest potential. To promote novel Generative AI methods and synthetic data, and to investigate how synthetic data can better train face recognition systems, we introduce the 2nd FRCSyn-onGoing challenge, based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024. This ongoing challenge provides researchers with an accessible platform to benchmark i) novel Generative AI methods and synthetic data, and ii) novel face recognition systems specifically designed to take advantage of synthetic data. We focus on the use of synthetic data both on its own and in combination with real data to address current challenges in face recognition such as demographic bias, domain adaptation, and performance constraints in demanding situations, including age disparities between training and testing, changes in pose, and occlusions. This second edition yields several findings, including a direct comparison with the first edition, in which synthetic databases were restricted to DCFace and GANDiffFace.
