Understanding human affective behaviour, especially in the dynamics of real-world settings, requires Facial Expression Recognition (FER) models to continuously adapt to individual differences in user expression, contextual attributions, and the environment. Current (deep) Machine Learning (ML)-based FER approaches, pre-trained in isolation on benchmark datasets, fail to capture the nuances of real-world interactions, where data is available only incrementally as the agent or robot acquires it during interactions. In such approaches, new learning comes at the cost of previous knowledge, resulting in catastrophic forgetting. Lifelong or Continual Learning (CL), on the other hand, enables adaptability in agents by being sensitive to changing data distributions, integrating new information without interfering with previously learnt knowledge. Positing CL as an effective learning paradigm for FER, this work presents the Continual Facial Expression Recognition (ConFER) benchmark, which evaluates popular CL techniques on FER tasks. It presents a comparative analysis of several CL-based approaches on popular FER datasets such as CK+, RAF-DB, and AffectNet, and presents strategies for a successful implementation of ConFER for Affective Computing (AC) research. CL techniques, under different learning settings, are shown to achieve state-of-the-art (SOTA) performance across several datasets, thus motivating a discussion on the benefits of applying CL principles towards human behaviour understanding, particularly from facial expressions, as well as the challenges entailed.