Emotion recognition in conversation (ERC) is a crucial task in natural language processing and affective computing. This paper proposes MultiDAG+CL, a novel approach to multimodal ERC that uses a directed acyclic graph (DAG) to integrate textual, acoustic, and visual features within a unified framework. The model is enhanced with curriculum learning (CL) to address the challenges of emotional shifts and data imbalance: CL presents training samples gradually, in a meaningful order, thereby improving the model's handling of both. Experimental results on the IEMOCAP and MELD datasets demonstrate that the MultiDAG+CL models outperform baseline models. We release the code for MultiDAG+CL and the experiments: https://github.com/vanntc711/MultiDAG-CL
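The idea of presenting training samples in a meaningful order can be illustrated with a minimal sketch. It is not the released MultiDAG+CL implementation. It assumes, purely for illustration, that a conversation's difficulty is the fraction of adjacent utterances whose emotion label changes, and that training proceeds in stages that cumulatively unlock harder buckets. The names `emotion_shift_rate` and `curriculum_stages` are hypothetical.

```python
from typing import List, Sequence


def emotion_shift_rate(labels: Sequence[str]) -> float:
    """Assumed difficulty score: fraction of adjacent utterance pairs
    whose emotion labels differ (0.0 = no shifts, 1.0 = all shifts)."""
    if len(labels) < 2:
        return 0.0
    shifts = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return shifts / (len(labels) - 1)


def curriculum_stages(
    conversations: List[Sequence[str]], num_buckets: int
) -> List[List[int]]:
    """Return, per training stage, the indices of conversations available.

    Conversations are sorted easiest-first by the assumed difficulty score
    and split into buckets; stage k exposes buckets 0..k cumulatively.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    # sorted() is stable, so equally difficult conversations keep input order.
    order = sorted(
        range(len(conversations)),
        key=lambda i: emotion_shift_rate(conversations[i]),
    )
    bucket_size = -(-len(order) // num_buckets)  # ceiling division
    return [order[: k * bucket_size] for k in range(1, num_buckets + 1)]


# Toy example: each conversation is its sequence of emotion labels.
convs = [
    ["happy", "sad", "happy"],          # shift rate 1.0 (hardest)
    ["neutral", "neutral", "neutral"],  # shift rate 0.0 (easiest)
    ["sad", "sad", "angry"],            # shift rate 0.5
]
print(curriculum_stages(convs, 2))  # [[1, 2], [1, 2, 0]]
```

In this sketch, the first stage trains only on the two conversations with the fewest emotion shifts, and the second stage adds the hardest one. Other difficulty measures, such as class rarity for handling data imbalance, could be substituted for `emotion_shift_rate` without changing the staging logic.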