Multimodal Sentiment Analysis (MSA) uses multimodal data to infer users' sentiment. Previous methods either treat each modality's contribution equally or statically take text as the dominant modality during interaction, neglecting situations where any modality may become dominant. In this paper, we propose a Knowledge-Guided Dynamic Modality Attention Fusion Framework (KuDA) for multimodal sentiment analysis. KuDA uses sentiment knowledge to guide the model in dynamically selecting the dominant modality and adjusting each modality's contribution. In addition, given the obtained multimodal representation, the model can further highlight the dominant modality's contribution through a correlation evaluation loss. Extensive experiments on four MSA benchmark datasets show that KuDA achieves state-of-the-art performance and adapts to different dominant-modality scenarios.
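The core idea of dynamically weighting modalities by knowledge-guided scores can be sketched as follows. This is a minimal illustration, not KuDA's actual architecture: the function names, the per-modality scalar scores, and the softmax weighting are all hypothetical simplifications of the paper's attention mechanism.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def dynamic_modality_fusion(feats, knowledge_scores):
    """Fuse per-modality features with knowledge-guided attention weights.

    feats: dict mapping modality name -> feature vector of shape (d,)
    knowledge_scores: dict mapping modality name -> scalar relevance score
        (hypothetical stand-in for sentiment-knowledge guidance)

    Returns the fused representation and the weight assigned to each
    modality; the modality with the highest score dominates the fusion.
    """
    names = list(feats)
    weights = softmax(np.array([knowledge_scores[m] for m in names]))
    fused = sum(w * feats[m] for w, m in zip(weights, names))
    return fused, dict(zip(names, weights))
```

For example, if the text modality receives the highest knowledge score for a given sample, its feature vector dominates the fused representation; for another sample, audio or vision could take that role instead of being fixed in advance.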