Facial Expression Recognition (FER) is a machine learning problem that deals with recognizing human facial expressions. While existing work has improved performance in recent years, FER in the wild and under challenging conditions remains difficult. In this paper, a lightweight patch and attention network based on MobileNetV1, referred to as PAtt-Lite, is proposed to improve FER performance under challenging conditions. A truncated ImageNet-pre-trained MobileNetV1 is used as the backbone feature extractor of the proposed method. In place of the truncated layers, a patch extraction block is proposed to extract significant local facial features and enhance the representation from MobileNetV1, especially under challenging conditions. An attention classifier is also proposed to improve the learning of these patched feature maps from the extremely lightweight feature extractor. Experimental results on public benchmark databases demonstrate the effectiveness of the proposed method. PAtt-Lite achieved state-of-the-art results on CK+, RAF-DB, FER2013, FERPlus, and the challenging-condition subsets of RAF-DB and FERPlus.
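To make the idea of attending over patched feature maps concrete, the following is a minimal toy sketch and not the PAtt-Lite implementation. It assumes a single-channel feature map, non-overlapping square patches, a hand-made [mean, max] descriptor per patch in place of learned convolutions, and a fixed query vector for dot-product attention. All function names, the patch size, and the query are hypothetical choices made only for illustration; the abstract does not specify any of them.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def split_into_patches(feature_map: Matrix, patch: int) -> List[Matrix]:
    """Split an H x W map into non-overlapping patch x patch tiles (row-major order)."""
    height, width = len(feature_map), len(feature_map[0])
    if height % patch or width % patch:
        raise ValueError("feature map size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in feature_map[top:top + patch]])
    return tiles


def patch_descriptor(tile: Matrix) -> List[float]:
    """Summarise a tile as [mean, max]; an assumed stand-in for learned local features."""
    values = [v for row in tile for v in row]
    return [sum(values) / len(values), max(values)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(descriptors: List[List[float]],
                   query: List[float]) -> Tuple[List[float], List[float]]:
    """Scaled dot-product attention of one query over all patch descriptors."""
    dim = len(query)
    scores = [sum(d * q for d, q in zip(desc, query)) / math.sqrt(dim)
              for desc in descriptors]
    weights = softmax(scores)
    pooled = [sum(w * desc[i] for w, desc in zip(weights, descriptors))
              for i in range(dim)]
    return weights, pooled


# Toy 4 x 4 "feature map" with four distinct 2 x 2 regions (values are made up).
feature_map = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [2.0, 2.0, 4.0, 4.0],
    [2.0, 2.0, 4.0, 4.0],
]
tiles = split_into_patches(feature_map, patch=2)
descriptors = [patch_descriptor(t) for t in tiles]
weights, pooled = attention_pool(descriptors, query=[1.0, 1.0])
print(weights, pooled)
```

In this toy setting the attention weights concentrate on the patch with the strongest response, which is the intuition behind letting a classifier weight local facial regions unequally. A real model would learn both the patch features and the attention parameters end to end.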