Multi-modal Emotion Recognition in Conversation (MERC) has received considerable attention in fields such as human-computer interaction and recommendation systems. Most existing works perform feature disentanglement and fusion to extract emotional contextual information from multi-modal features, and then perform emotion classification. After revisiting the characteristics of MERC, we argue that long-range contextual semantic information should be extracted in the feature disentanglement stage, and that inter-modal semantic consistency should be maximized in the feature fusion stage. Mamba, a recent State Space Model (SSM), can model long-distance dependencies efficiently. In this work, we build on these insights to further improve the performance of MERC. Specifically, in the feature disentanglement stage, we propose Broad Mamba, which does not rely on a self-attention mechanism for sequence modeling; instead, it uses state space models to compress emotional representations and a broad learning system to explore the potential data distribution in a broad space. Unlike previous SSMs, Broad Mamba uses a bidirectional SSM convolution, designed to extract global context information. In the feature fusion stage, we design a probability-guided multi-modal fusion strategy to maximize the consistency of information between modalities. Experimental results show that the proposed method overcomes the computational and memory limitations of Transformers when modeling long-distance contexts, and has great potential to become a next-generation general architecture for MERC.
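To make the idea of bidirectional state space modeling concrete, the following is a minimal sketch and not the Broad Mamba implementation. It assumes a simple diagonal linear SSM with one scalar state per feature channel (h_t = a * h_{t-1} + b * x_t, y_t = c * h_t) and combines a forward and a backward scan by summation. The function names `ssm_scan` and `bidirectional_ssm`, the parameters `a`, `b`, `c`, and the summation step are all illustrative assumptions; the abstract does not specify these details.

```python
def ssm_scan(xs, a, b, c):
    """Causal scan of a diagonal linear SSM (illustrative assumption).

    xs: list of T feature vectors (lists of floats), one per utterance.
    a, b, c: per-channel scalars for h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.
    """
    h = [0.0] * len(a)  # one scalar hidden state per channel
    ys = []
    for x in xs:
        h = [a_i * h_i + b_i * x_i for a_i, h_i, b_i, x_i in zip(a, h, b, x)]
        ys.append([c_i * h_i for c_i, h_i in zip(c, h)])
    return ys


def bidirectional_ssm(xs, a, b, c):
    """Sum a forward scan and a backward scan so each step sees both directions."""
    fwd = ssm_scan(xs, a, b, c)
    # Scan the reversed sequence, then flip the outputs back to original order.
    bwd = ssm_scan(xs[::-1], a, b, c)[::-1]
    return [[f + g for f, g in zip(f_row, g_row)] for f_row, g_row in zip(fwd, bwd)]


# Toy conversation: three utterances with two feature channels each.
utterances = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
a = [0.9, 0.5]  # per-channel state decay (hypothetical values)
b = [1.0, 1.0]
c = [1.0, 1.0]
context = bidirectional_ssm(utterances, a, b, c)
```

In this sketch the cost grows linearly with the number of utterances, and the output for the first utterance depends on the last one only through the backward scan, which is the property a purely causal SSM lacks.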