Facial Expression Recognition (FER) is the machine learning problem of recognizing human facial expressions. Although existing work has improved performance in recent years, FER in the wild and under challenging conditions remains difficult. This paper proposes PAtt-Lite, a lightweight patch and attention network based on MobileNetV1, to improve FER performance under challenging conditions. A truncated ImageNet-pre-trained MobileNetV1 serves as the backbone feature extractor. In place of the truncated layers, a patch extraction block extracts significant local facial features to enhance the representation from MobileNetV1, especially under challenging conditions. An attention classifier is also proposed to improve learning from these patched feature maps, which come from an extremely lightweight feature extractor. Experimental results on public benchmark databases demonstrate the effectiveness of the proposed method: PAtt-Lite achieved state-of-the-art results on CK+, RAF-DB, FER2013, FERPlus, and the challenging-condition subsets of RAF-DB and FERPlus. The source code for the proposed method will be available at https://github.com/JLREx/PAtt-Lite.
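The general idea of splitting a feature map into local patches and weighting them with attention can be illustrated with a small, dependency-free sketch. This is not the PAtt-Lite implementation: the abstract does not specify the layers, so the helper names (`extract_patches`, `attention_pool`), the non-overlapping patch layout, the fixed `query` vector, and the toy 4x4 feature map are all assumptions made purely for illustration. In a trained model the query and the patch features would be learned.

```python
import math


def extract_patches(feature_map, patch_size):
    """Split a 2D feature map into non-overlapping square patches (hypothetical helper).

    Each patch is flattened row by row into a plain list of floats.
    """
    height, width = len(feature_map), len(feature_map[0])
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [feature_map[top + dy][left + dx]
                     for dy in range(patch_size)
                     for dx in range(patch_size)]
            patches.append(patch)
    return patches


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patches, query):
    """Scaled dot-product attention over patches (illustrative, not the paper's classifier).

    Scores each patch against the query, normalizes the scores with softmax,
    and returns the attention-weighted sum of patches plus the weights.
    """
    scale = math.sqrt(len(query))
    scores = [sum(p * q for p, q in zip(patch, query)) / scale
              for patch in patches]
    weights = softmax(scores)
    pooled = [sum(w * patch[i] for w, patch in zip(weights, patches))
              for i in range(len(query))]
    return pooled, weights


# Toy 4x4 "feature map"; real backbone outputs have many channels.
feature_map = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
patches = extract_patches(feature_map, patch_size=2)
query = [0.5, 0.5, 0.5, 0.5]  # assumed fixed here; would be learned in practice
pooled, weights = attention_pool(patches, query)
```

Here the patch with the strongest response to the query receives the largest attention weight, so it dominates the pooled representation. That is the intuition behind letting a classifier focus on informative local facial regions.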