FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space

Facial Expression Recognition (FER) plays a pivotal role in understanding human emotional cues. However, traditional FER methods based on visual information rely on separate preprocessing, feature extraction, and multi-stage classification procedures, which increase computational complexity and demand substantial computing resources. Convolutional Neural Network (CNN)-based FER schemes often fail to capture the deep, long-distance dependencies within facial expression images, while Transformers carry inherent quadratic computational complexity. To address both issues, this paper presents the FER-YOLO-Mamba model, which integrates the principles of Mamba and YOLO to recognize and localize facial expressions jointly and efficiently. Within the FER-YOLO-Mamba model, we further devise a FER-YOLO-VSS dual-branch module, which combines the strength of convolutional layers in local feature extraction with the capability of State Space Models (SSMs) to capture long-distance dependencies. To the best of our knowledge, this is the first Vision Mamba model designed for facial expression detection and classification. To evaluate the proposed model, we conducted experiments on two benchmark datasets, RAF-DB and SFEW. The experimental results indicate that FER-YOLO-Mamba achieved better results than the models it was compared against. The code is available at https://github.com/SwjtuMa/FER-YOLO-Mamba.
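
The dual-branch idea in the abstract (a convolutional branch for local features alongside a state-space branch for long-range context) can be illustrated with a toy 1-D sketch. This is not the authors' FER-YOLO-VSS module: the function names `ssm_scan`, `local_conv`, and `dual_branch`, the scalar fixed parameters `a`, `b`, `c`, and the fusion by element-wise addition are all assumptions made for illustration. In particular, Mamba's selective SSM makes its parameters depend on the input, which this sketch omits.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Scalar linear state-space recurrence (fixed, non-selective parameters):
    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t.
    Runs in time linear in len(x); each output depends on all earlier inputs."""
    h = 0.0
    y = []
    for x_t in x:
        h = a * h + b * x_t
        y.append(c * h)
    return y


def local_conv(x: List[float], kernel: List[float]) -> List[float]:
    """1-D 'same' cross-correlation with zero padding (odd-length kernel).
    Each output only sees a small neighbourhood of the input."""
    k = len(kernel) // 2
    out = []
    for i in range(len(x)):
        s = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - k
            if 0 <= idx < len(x):
                s += w * x[idx]
        out.append(s)
    return out


def dual_branch(x: List[float], kernel: List[float],
                a: float, b: float, c: float) -> List[float]:
    """Hypothetical fusion: sum the local (conv) and long-range (SSM) branches."""
    conv_out = local_conv(x, kernel)
    ssm_out = ssm_scan(x, a, b, c)
    return [u + v for u, v in zip(conv_out, ssm_out)]


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    # The conv branch responds only near the impulse; the SSM branch decays
    # geometrically and still carries the impulse at the last position.
    print(local_conv(impulse, [1.0, 1.0, 1.0]))
    print(ssm_scan(impulse, 0.5, 1.0, 1.0))
    print(dual_branch(impulse, [1.0, 1.0, 1.0], 0.5, 1.0, 1.0))
```

With an impulse input, the convolutional branch is zero outside the kernel's window, while the recurrent state keeps a decaying trace of the impulse across the whole sequence. That is the complementary behaviour the abstract attributes to the two branches.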

