Although multimodal sentiment analysis remains a fertile research ground that merits further investigation, current approaches incur high annotation costs and suffer from label ambiguity, which hinders the acquisition of high-quality labeled data. Furthermore, selecting the right interactions is essential, because the relative importance of intra- and inter-modal interactions can differ across samples. To this end, we propose Semi-IIN, a Semi-supervised Intra-inter modal Interaction learning Network for multimodal sentiment analysis. Semi-IIN integrates masked attention and gating mechanisms, enabling effective dynamic selection after independently capturing intra- and inter-modal interactive information. Combined with a self-training approach, Semi-IIN fully exploits the knowledge learned from unlabeled data. Experimental results on two public datasets, MOSI and MOSEI, demonstrate the effectiveness of Semi-IIN, which establishes a new state of the art on several metrics. Code is available at https://github.com/flow-ljh/Semi-IIN.
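To make the masked-attention-plus-gating idea concrete, the sketch below is a minimal, illustrative NumPy version (not the paper's implementation; the toy dimensions, random weight matrix `W`, and single-head attention are assumptions). Intra-modal attention masks out cross-modality positions, inter-modal attention masks out same-modality positions, and a sigmoid gate then dynamically mixes the two views per sample and dimension:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention; mask is (T, T) boolean, True = may attend."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)  # blocked positions get ~zero weight
    return softmax(scores, axis=-1) @ v

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d = 8
# Toy sequence: 3 text, 3 audio, 3 visual tokens (modality ids 0/1/2).
mod_ids = np.array([0] * 3 + [1] * 3 + [2] * 3)
x = rng.standard_normal((9, d))

same = mod_ids[:, None] == mod_ids[None, :]
h_intra = masked_attention(x, x, x, same)    # attend only within the same modality
h_inter = masked_attention(x, x, x, ~same)   # attend only across modalities

# Gating: a learned sigmoid gate (here a random, untrained W for illustration)
# yields a per-dimension convex combination of the intra- and inter-modal views.
W = rng.standard_normal((2 * d, d)) * 0.1
g = sigmoid(np.concatenate([h_intra, h_inter], axis=-1) @ W)
fused = g * h_intra + (1 - g) * h_inter
```

Because the gate output lies in (0, 1), each fused feature is elementwise bounded by the intra- and inter-modal features, which is what lets the network favor one interaction type or the other per sample.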
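The self-training component can be sketched as standard confidence-thresholded pseudo-labeling; the function below is a generic illustration under assumed names (`model_predict`, `threshold`), not the paper's exact procedure:

```python
def self_training_round(model_predict, labeled, unlabeled, threshold=0.9):
    """One round of pseudo-labeling.

    model_predict(x) -> (label, confidence); unlabeled samples whose
    confidence clears the threshold are adopted into the labeled set.
    """
    new_labeled = list(labeled)
    remaining = []
    for x in unlabeled:
        y, conf = model_predict(x)
        if conf >= threshold:
            new_labeled.append((x, y))  # trust the model's pseudo-label
        else:
            remaining.append(x)         # keep for a later round
    return new_labeled, remaining

# Toy usage with a stand-in predictor: confident far from the decision boundary.
predict = lambda x: (x > 0, 0.95 if abs(x) > 1 else 0.5)
grown, left = self_training_round(predict, [(2.0, True)], [3.0, -2.0, 0.1])
```

Repeating such rounds (retraining on the grown labeled set each time) is how the model extracts knowledge from unlabeled data.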