Multimodal affective computing (MAC) has garnered increasing attention due to its broad applications in analyzing human behaviors and intentions, especially in the text-dominated multimodal affective computing field. This survey presents recent trends in multimodal affective computing from an NLP perspective through four prominent tasks: multimodal sentiment analysis, multimodal emotion recognition in conversation, multimodal aspect-based sentiment analysis, and multimodal multi-label emotion recognition. The goal of this survey is to explore the current landscape of multimodal affective research, identify development trends, and highlight the similarities and differences across tasks, offering a comprehensive report on recent progress in multimodal affective computing from an NLP perspective. This survey covers the formalization of each task, provides an overview of relevant works, describes the benchmark datasets, and details the evaluation metrics for each task. It also briefly discusses research in multimodal affective computing involving facial expressions, acoustic signals, physiological signals, and emotion causes. We further discuss the technical approaches, challenges, and future directions in multimodal affective computing. To support further research, we have released a repository that compiles related works in multimodal affective computing, providing detailed resources and references for the community.