Navigating Label Ambiguity for Facial Expression Recognition in the Wild

Facial expression recognition (FER) remains a challenging task due to label ambiguity caused by the subjective nature of facial expressions and noisy samples. Additionally, class imbalance, which is common in real-world datasets, further complicates FER. Although many studies have shown impressive improvements, they typically address only one of these issues, leading to suboptimal results. To tackle both challenges simultaneously, we propose a novel framework called Navigating Label Ambiguity (NLA), which is robust under real-world conditions. The motivation behind NLA is that dynamically estimating and emphasizing ambiguous samples at each iteration helps mitigate noise and class imbalance by reducing the model's bias toward majority classes. To achieve this, NLA consists of two main components: Noise-aware Adaptive Weighting (NAW) and consistency regularization. Specifically, NAW adaptively assigns higher importance to ambiguous samples and lower importance to noisy ones, based on the correlation between the intermediate prediction scores for the ground truth and the nearest negative. Moreover, we incorporate a regularization term to ensure consistent latent distributions. Consequently, NLA enables the model to progressively focus on more challenging ambiguous samples, which primarily belong to the minority class, in the later stages of training. Extensive experiments demonstrate that NLA outperforms existing methods in both overall and mean accuracy, confirming its robustness against noise and class imbalance. To the best of our knowledge, this is the first framework to address both problems simultaneously.

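
As an illustration of the weighting idea described in the abstract, the following is a minimal sketch and not the authors' implementation. The abstract states only that NAW relies on the prediction scores for the ground truth and the nearest negative; it does not give the weighting function. The sketch therefore assumes a Gaussian over the margin between those two scores. The function names (`ambiguity_weights`, `weighted_cross_entropy`) and the width parameter `sigma` are hypothetical. Under this assumption, samples whose two scores are close (ambiguous) receive weights near 1, while samples whose nearest negative strongly dominates the labeled class (possibly noisy) receive weights near 0. The consistency regularization term is not sketched here.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ambiguity_weights(logits, labels, sigma=0.3):
    """Hypothetical per-sample weights from the ground-truth score and
    the nearest-negative score. The Gaussian form and sigma are
    assumptions made for illustration, not taken from the paper."""
    probs = softmax(logits)
    idx = np.arange(len(labels))
    p_gt = probs[idx, labels]            # score of the labeled class
    masked = probs.copy()
    masked[idx, labels] = -np.inf        # exclude the labeled class
    p_neg = masked.max(axis=1)           # highest-scoring other class
    margin = p_gt - p_neg                # near 0: ambiguous; near -1: likely noisy
    weights = np.exp(-(margin ** 2) / (2.0 * sigma ** 2))
    return weights, margin

def weighted_cross_entropy(logits, labels, weights):
    """Cross-entropy averaged with the per-sample weights."""
    probs = softmax(logits)
    idx = np.arange(len(labels))
    nll = -np.log(probs[idx, labels] + 1e-12)
    return float((weights * nll).sum() / (weights.sum() + 1e-12))

# Three toy samples, all labeled as class 0.
logits = np.array([
    [4.0, 0.0, 0.0],   # confident and agrees with the label
    [1.0, 1.1, 0.0],   # ambiguous between classes 0 and 1
    [0.0, 4.0, 0.0],   # confident but disagrees with the label
])
labels = np.array([0, 0, 0])
weights, margin = ambiguity_weights(logits, labels)
loss = weighted_cross_entropy(logits, labels, weights)
```

With these toy inputs, the second sample receives the largest weight. Because the assumed Gaussian is symmetric in the margin, confidently correct samples are down-weighted as well as likely-noisy ones. The width is fixed here, whereas the abstract describes an estimate that is updated dynamically at each iteration.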