Facial expressions play a crucial role in human communication serving as a powerful and impactful means to express a wide range of emotions. With advancements in artificial intelligence and computer vision, deep neural networks have emerged as effective tools for facial emotion recognition. In this paper, we propose EmoNeXt, a novel deep learning framework for facial expression recognition based on an adapted ConvNeXt architecture network. We integrate a Spatial Transformer Network (STN) to focus on feature-rich regions of the face and Squeeze-and-Excitation blocks to capture channel-wise dependencies. Moreover, we introduce a self-attention regularization term, encouraging the model to generate compact feature vectors. We demonstrate the superiority of our model over existing state-of-the-art deep learning models on the FER2013 dataset regarding emotion classification accuracy.
翻译:面部表情在人类交流中扮演着关键角色,是表达广泛情感的一种强有力且极具影响力的方式。随着人工智能和计算机视觉的进步,深度神经网络已成为面部表情识别的有效工具。本文提出EmoNeXt,一种基于改进型ConvNeXt架构的新型深度学习框架,用于面部表情识别。我们集成了空间变换网络(STN)以聚焦于人脸的特征丰富区域,并采用挤压-激励模块以捕获通道间的依赖关系。此外,我们引入了一种自注意力正则化项,促使模型生成紧凑的特征向量。我们在FER2013数据集上验证了所提模型在情感分类准确率方面优于现有最先进的深度学习模型。