The field of Multimodal Sentiment Analysis (MSA) has recently witnessed an emerging direction seeking to tackle the issue of data incompleteness. Recognizing that the language modality typically contains dense sentiment information, we treat it as the dominant modality and present an innovative Language-dominated Noise-resistant Learning Network (LNLN) to achieve robust MSA. The proposed LNLN features a dominant modality correction (DMC) module and a dominant modality based multimodal learning (DMML) module, which enhance the model's robustness across various noise scenarios by ensuring the quality of the dominant modality's representations. Beyond the methodological design, we perform comprehensive experiments under random data missing scenarios, utilizing diverse and meaningful settings on several popular datasets (\textit{e.g.,} MOSI, MOSEI, and SIMS), providing greater uniformity, transparency, and fairness than existing evaluations in the literature. Empirically, LNLN consistently outperforms existing baselines, demonstrating superior performance across these challenging and extensive evaluations.