ReSup: Reliable Label Noise Suppression for Facial Expression Recognition

Because facial expression recognition (FER) is inherently ambiguous and subjective, label noise is widespread in FER datasets. To address this problem, current FER methods often directly predict during training whether the label of an input image is noisy, aiming to reduce the contribution of noisy data to training. However, we argue that such methods suffer from the low reliability of this noise decision. As a result, some mistakenly abandoned clean data are not sufficiently utilized, and some mistakenly kept noisy data disturb the model's learning process. In this paper, we propose a more reliable noisy-label suppression method called ReSup (Reliable label noise Suppression for FER). First, instead of directly predicting whether a label is noisy, ReSup makes the noise decision by simultaneously modeling the distributions of noisy and clean labels according to the disagreement between the prediction and the target. Specifically, to achieve optimal distribution modeling, ReSup models the similarity distribution of all samples. To further enhance the reliability of the noise decisions, ReSup uses two networks to jointly achieve noise suppression. Specifically, ReSup exploits the property that two networks are less likely to make the same mistakes: the two networks swap decisions and tend to trust decisions with high agreement. Extensive experiments on three popular benchmarks show that the proposed method significantly outperforms state-of-the-art noisy-label FER methods, by 3.01% on the FERPlus benchmark. Code: https://github.com/purpleleaves007/FERDenoise
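
The two ideas in the abstract (modeling clean and noisy labels jointly from prediction/target similarity, then letting two networks swap decisions) can be illustrated with a minimal sketch. This is not the authors' implementation and is not taken from the linked repository. It assumes cosine similarity between the predicted distribution and the one-hot label, a two-component 1-D Gaussian mixture fitted by EM as the distribution model, and a simple agreement factor `1 - |w_a - w_b|`. The function names `similarity`, `clean_posterior`, and `swapped_weights` are hypothetical.

```python
import numpy as np


def similarity(probs, labels):
    """Cosine similarity between each predicted distribution and its one-hot label."""
    onehot = np.eye(probs.shape[1])[labels]
    num = (probs * onehot).sum(axis=1)
    den = np.linalg.norm(probs, axis=1) * np.linalg.norm(onehot, axis=1)
    return num / den


def clean_posterior(sim, n_iter=50):
    """Fit a 2-component 1-D Gaussian mixture by EM (an assumed model choice).

    Returns, per sample, the posterior of the component with the higher mean,
    read here as the probability that the label is clean.
    """
    mu = np.array([sim.min(), sim.max()], dtype=float)
    var = np.full(2, sim.var() + 1e-6)
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibilities computed in log space for stability.
        log_p = (-0.5 * (sim[:, None] - mu) ** 2 / var
                 - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update means, variances, and mixing weights.
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * sim[:, None]).sum(axis=0) / nk
        var = (resp * (sim[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / nk.sum()
    return resp[:, np.argmax(mu)]


def swapped_weights(probs_a, probs_b, labels):
    """Per-sample loss weights for networks A and B.

    Each network is weighted by the other network's decision (the swap),
    scaled down where the two decisions disagree (assumed agreement rule).
    """
    w_a = clean_posterior(similarity(probs_a, labels))
    w_b = clean_posterior(similarity(probs_b, labels))
    agree = 1.0 - np.abs(w_a - w_b)
    return w_b * agree, w_a * agree
```

In a training loop, the returned weights would multiply each sample's loss for the corresponding network, so that samples both networks judge clean contribute most and samples with conflicting or noisy decisions are suppressed.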
