Facial Expression Recognition (FER) is vital for understanding interpersonal communication. However, existing classification methods are often vulnerable to noise, imbalanced datasets, overfitting, and poor generalization. In this paper, we propose GCF, a novel approach that applies Graph Convolutional Networks (GCNs) to FER. GCF uses Convolutional Neural Networks (CNNs), either custom architectures or pretrained models, for feature extraction. The extracted visual features are then represented on a graph, where a GCN layer enhances the local CNN features with global features. We evaluate GCF on benchmark datasets including CK+, JAFFE, and FERG. The results show that GCF significantly improves performance over state-of-the-art methods. For example, GCF raises the accuracy of ResNet18 from 92% to 98% on CK+, from 66% to 89% on JAFFE, and from 94% to 100% on FERG. Similarly, GCF raises the accuracy of VGG16 from 89% to 97% on CK+, from 72% to 92% on JAFFE, and from 96% to 99.49% on FERG. We provide a comprehensive analysis of our approach, demonstrating its effectiveness in capturing nuanced facial expressions. By integrating graph convolutions with CNNs, GCF significantly advances FER, offering improved accuracy and robustness in real-world applications.
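The pipeline described in the abstract (CNN features placed on a graph, then refined by a graph convolution) can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or how the layer is parameterized, so everything below is an assumption made for illustration: each sample's CNN feature vector is treated as one node, edges come from a cosine-similarity k-nearest-neighbour rule, and the layer uses the common symmetric-normalized form ReLU(D^{-1/2} A D^{-1/2} X W). The function names (`knn_adjacency`, `gcn_layer`) and the toy feature values are hypothetical and are not taken from the paper. Plain Python lists are used only to keep the sketch self-contained.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def knn_adjacency(features, k):
    """Symmetric k-NN adjacency with self-loops (assumed graph construction)."""
    n = len(features)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop keeps each node's own CNN feature
        sims = sorted(
            ((cosine(features[i], features[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for _, j in sims[:k]:
            adj[i][j] = adj[j][i] = 1.0
    return adj


def matmul(a, b):
    """Dense matrix product on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(features, adj, weight):
    """One graph convolution: ReLU(D^{-1/2} A D^{-1/2} X W)."""
    n = len(adj)
    deg = [sum(row) for row in adj]  # >= 1 because of self-loops
    a_hat = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(a_hat, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy stand-in for CNN outputs: 4 samples, 2-dimensional features (made-up values).
cnn_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for a learned weight matrix

adj = knn_adjacency(cnn_features, k=1)
refined = gcn_layer(cnn_features, adj, identity)
```

In this toy run, samples 0 and 1 become neighbours, as do samples 2 and 3, so each refined feature is the average of a sample's own vector and its neighbour's. This is the sense in which a graph layer mixes information from related samples into the local CNN feature. In a trained model the weight matrix would be learned and the refined features passed to a classifier.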