Emotion recognition is a complex task because both the perception and the production of emotions are inherently subjective. This subjectivity poses significant challenges for developing accurate and robust computational models. This thesis examines critical facets of emotion recognition, beginning with the collection of diverse datasets that account for psychological factors in emotion production. To address the challenge of non-representative training data, this work collects the Multimodal Stressed Emotion dataset, which introduces controlled stressors during data collection to better represent real-world influences on emotion production. To address label subjectivity, it comprehensively analyzes how data augmentation techniques and annotation schemes affect emotion perception and annotator labels. It further handles natural confounding variables and variations by using adversarial networks to isolate key factors such as stress from learned emotion representations during model training. To address concerns about leakage of sensitive demographic variables, this work uses adversarial learning to strip sensitive demographic information from multimodal encodings. Additionally, it proposes optimized sociological evaluation metrics for model testing that are aligned with cost-effective, real-world needs. Through these multifaceted studies of challenges in datasets, labels, modeling, the encoding of demographic and membership variables in representations, and evaluation, this research advances robust, practical emotion recognition. It lays the groundwork for cost-effective, generalizable emotion recognition models that are less likely to encode sensitive demographic information.