The diversity of features extracted by deep neural networks is important for a model's generalization ability and, accordingly, its performance across learning tasks. Facial expression recognition in the wild has attracted interest in recent years because of the difficulty of extracting discriminative and informative features from occluded images in real-world scenarios. In this paper, we propose a mechanism that diversifies the features extracted by the CNN layers of state-of-the-art facial expression recognition architectures, enhancing the model's capacity to learn discriminative features. To evaluate the proposed approach, we incorporate this mechanism into two state-of-the-art models to (i) diversify local/global features in an attention-based model and (ii) diversify features extracted by different learners in an ensemble-based model. Experimental results on three well-known in-the-wild facial expression recognition datasets, AffectNet, FER+, and RAF-DB, show the effectiveness of our method, which achieves state-of-the-art performance of 89.99% on RAF-DB and 89.34% on FER+, and a competitive accuracy of 60.02% on AffectNet.
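The abstract does not spell out how feature diversity is measured, so the following is only a minimal sketch under a stated assumption: that redundancy among the channels of one CNN layer is scored by their mean pairwise squared cosine similarity. The function name `diversity_penalty` and the similarity measure are hypothetical illustrations, not the paper's formulation.

```python
import numpy as np


def diversity_penalty(feature_maps: np.ndarray, eps: float = 1e-8) -> float:
    """Hypothetical redundancy score for one layer's feature maps.

    feature_maps: array of shape (C, H, W) with C >= 2 channels.
    Returns the mean squared cosine similarity over all pairs of distinct
    channels: close to 1 when channels are (anti-)parallel, 0 when they
    are mutually orthogonal. Lower values mean more diverse features.
    """
    num_channels = feature_maps.shape[0]
    if num_channels < 2:
        raise ValueError("need at least two channels to compare")

    # Flatten each channel into a vector and normalise it to unit length.
    flat = feature_maps.reshape(num_channels, -1).astype(np.float64)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, eps)

    # Cosine similarity between every pair of channels.
    similarity = unit @ unit.T

    # Keep only off-diagonal entries (pairs of distinct channels).
    off_diagonal = similarity[~np.eye(num_channels, dtype=bool)]
    return float(np.mean(off_diagonal ** 2))


# Three identical channels: maximally redundant.
redundant = np.tile(np.arange(1.0, 5.0).reshape(1, 2, 2), (3, 1, 1))
# Four one-hot channels with disjoint support: mutually orthogonal.
orthogonal = np.eye(4).reshape(4, 2, 2)

redundant_score = diversity_penalty(redundant)
orthogonal_score = diversity_penalty(orthogonal)
```

In a training setup, a term of this kind would typically be added to the classification loss with a small weight, so that the network is discouraged from learning near-duplicate feature maps; the weight and the layers it is applied to would be tuning choices.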