Complex Facial Expression Recognition Using Deep Knowledge Distillation of Basic Features

Complex emotion recognition is a cognitive task on which machines have yet to match the excellent performance achieved on other tasks that are at or above the level of human cognition. Emotion recognition through facial expressions is particularly difficult due to the complexity of the emotions expressed by the human face. For a machine to approach human-level performance in complex facial expression recognition, it may need to synthesise knowledge and understand new concepts in real time, as humans do. Humans are able to learn new concepts from only a few examples by distilling important information from memories. Inspired by human cognition and learning, we propose a novel continual learning method for complex facial expression recognition that can accurately recognise new compound expression classes using few training samples, by building on and retaining its knowledge of basic expression classes. We also use GradCAM visualisations to demonstrate the relationship between basic and compound facial expressions. Our method leverages this relationship through knowledge distillation and a novel Predictive Sorting Memory Replay to achieve the current state of the art in continual learning for complex facial expression recognition, with 74.28% Overall Accuracy on new classes. We also demonstrate that continual learning achieves far better performance on complex facial expression recognition than non-continual learning methods, improving on the state-of-the-art non-continual learning methods by 13.95%. Our work is also the first to apply few-shot learning to complex facial expression recognition, achieving the state of the art with 100% accuracy using only a single training sample per class.
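
The knowledge distillation mentioned above can be illustrated with a minimal sketch. It assumes the generic temperature-softened formulation (KL divergence between teacher and student output distributions), which may differ from the loss actually used in this work. The function names, the temperature value, and the example logits are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared.

    Assumption: this is the generic distillation loss, not necessarily
    the formulation used by the method described in the abstract.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the old model
    q = softmax(student_logits, temperature)  # predictions of the new model
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical logits over three basic expression classes.
teacher = [2.0, 0.5, -1.0]
student = [1.5, 0.7, -0.5]
loss = distillation_loss(teacher, student)
```

In a continual learning setting, a term of this kind is typically added to the classification loss on the new classes, so that the updated model keeps its outputs on the previously learned classes close to those of the earlier model.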
