Face recognition has made remarkable strides, driven by the growing scale of datasets and by advances in backbone architectures and discriminative losses. However, face recognition performance is heavily affected by label noise, especially closed-set noise. While numerous studies have focused on handling label noise, closed-set noise remains challenging. This paper attributes the challenge to two factors: training is not robust to noise in its early stages, and samples with low confidence, which are often misclassified as closed-set noise in later training phases, require an appropriate learning strategy. To address these issues, we propose a new framework that stabilizes training in the early stages and splits the samples into clean, ambiguous, and noisy groups, each handled with a separate training strategy. First, we employ generated auxiliary closed-set noisy samples to enable the model to identify noisy data in the early stages of training. Next, we split samples into clean, ambiguous, and noisy groups according to their similarity to the positive center and the nearest negative center. We then perform label fusion for ambiguous samples by incorporating accumulated model predictions. Finally, we apply label smoothing within the closed set, adjusting the label to a point between the nearest negative class and the initially assigned label. Extensive experiments on mainstream face datasets validate the effectiveness of our method, which achieves state-of-the-art results. The code will be released upon acceptance.
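The grouping and label-adjustment steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the margin thresholds `tau_clean` and `tau_noisy`, the smoothing weight `eps`, the fusion weight `alpha`, the function name `split_and_adjust`, and the choice to apply smoothing only to the noisy group are all assumptions made for illustration.

```python
import numpy as np


def split_and_adjust(embeddings, labels, centers, acc_pred=None,
                     tau_clean=0.1, tau_noisy=-0.1, eps=0.2, alpha=0.5):
    """Hypothetical sketch: group samples and build soft targets.

    All thresholds and weights are illustrative assumptions,
    not values taken from the paper.
    """
    # Cosine similarity between every sample and every class center.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    ctr = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    sims = emb @ ctr.T                      # shape (N, C)

    idx = np.arange(len(labels))
    pos_sim = sims[idx, labels]             # similarity to the assigned (positive) center

    # Mask out the positive class to find the nearest negative center.
    neg = sims.copy()
    neg[idx, labels] = -np.inf
    nearest_neg = neg.argmax(axis=1)
    neg_sim = neg[idx, nearest_neg]

    # Assumed rule: compare the positive/negative similarity gap to two thresholds.
    margin = pos_sim - neg_sim
    groups = np.where(margin >= tau_clean, "clean",
                      np.where(margin <= tau_noisy, "noisy", "ambiguous"))

    # Start from one-hot targets on the assigned labels.
    targets = np.zeros_like(sims)
    targets[idx, labels] = 1.0

    # Ambiguous group: fuse the given label with accumulated predictions (assumed form).
    if acc_pred is not None:
        amb = np.flatnonzero(groups == "ambiguous")
        targets[amb] = (1.0 - alpha) * targets[amb] + alpha * acc_pred[amb]

    # Noisy group (assumption): move the label part-way toward the nearest negative class.
    noisy = np.flatnonzero(groups == "noisy")
    targets[noisy, labels[noisy]] = 1.0 - eps
    targets[noisy, nearest_neg[noisy]] = eps

    return groups, nearest_neg, targets
```

In this sketch a sample whose assigned center is clearly the closest is kept as clean, a sample that sits much closer to another identity's center is treated as closed-set noise, and everything in between is ambiguous. The `acc_pred` argument stands in for the accumulated model predictions mentioned in the abstract, for example a running average of softmax outputs over epochs.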