The integration of artificial intelligence (AI) into healthcare has advanced significantly, yet affect recognition remains a major challenge, particularly in AI-assisted interventions such as Computerized Cognitive Training (CCT). The THERADIA-WoZ corpus was developed to enable multimodal affect recognition in the context of AI-driven CCT, focusing on an older adult population. This study extends the corpus with a dataset collected from young adults, allowing direct comparison of affect recognition models across age groups. Our objective was to assess whether multimodal models based on dimensions borrowed from appraisal theories outperform those based on categorical labels, and to evaluate how well each generalises across the two age-group corpora. After comparing the two corpora, we trained and tested models under within-corpus, cross-corpus, and mixed-corpus evaluation. Appraisal dimensions consistently outperformed categorical labels across all conditions, demonstrating greater predictive accuracy and stability. Notably, categorical labels failed to generalise across age corpora: their performance dropped to chance level in cross-corpus evaluation. In contrast, appraisal dimensions maintained predictive performance above chance, reinforcing their robustness for cross-age affect recognition. Furthermore, training on both corpora did not improve generalisation beyond within-corpus training. The findings support the theoretical and practical advantages of appraisal dimensions over categorical labels in affective computing. They also highlight the importance of multimodal fusion and deep learning representations for emotion modelling. To facilitate future research, we provide an API for researchers interested in time-continuous emotion prediction, offering the behavioural sciences a tool for measuring emotional states in various experimental settings.