Multimodal Aspect-based Sentiment Analysis (MABSA) is a fine-grained sentiment analysis task that has attracted growing research interest. Existing work mainly uses image information to improve performance on MABSA. However, most studies overestimate the importance of images: the datasets contain many noisy images that are unrelated to the text, and these images harm model learning. Although some work attempts to filter out low-quality noisy images by setting thresholds, threshold-based filtering inevitably discards a lot of useful image information as well. In this work, we therefore ask whether the negative impact of noisy images can be reduced without modifying the data. To this end, we borrow the idea of Curriculum Learning and propose a Multi-grained Multi-curriculum Denoising Framework (M2DF), which achieves denoising by adjusting the order of the training data. Extensive experimental results show that our framework consistently outperforms state-of-the-art work on three sub-tasks of MABSA.
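To make the idea of denoising by reordering rather than filtering concrete, the following is a minimal, generic curriculum-learning sketch. It is not the M2DF implementation: the abstract does not say how noise is measured or how the curricula are scheduled, so the per-sample `noise_scores`, the linear `pacing_fraction`, and the function names are all assumptions made for illustration. The point it shows is that cleaner samples are presented first and noisier ones are introduced later, so that by the final epoch every sample is still used and nothing is removed from the dataset.

```python
import math
import random


def pacing_fraction(epoch, total_epochs, start=0.3):
    """Fraction of the cleanest-first data visible at a given epoch.

    Linear pacing from `start` to 1.0 is an assumption of this sketch,
    not a schedule taken from the paper.
    """
    if total_epochs <= 1:
        return 1.0
    progress = epoch / (total_epochs - 1)
    return min(1.0, start + (1.0 - start) * progress)


def curriculum_batches(samples, noise_scores, epoch, total_epochs,
                       batch_size, start=0.3, seed=0):
    """Yield batches drawn only from the cleanest portion visible this epoch.

    `noise_scores[i]` is a hypothetical per-sample score (higher = noisier),
    for example one minus an image-text similarity.
    """
    # Rank samples from cleanest to noisiest; no sample is ever deleted.
    order = sorted(range(len(samples)), key=lambda i: noise_scores[i])
    # Expose a growing prefix of that ranking as training progresses.
    fraction = pacing_fraction(epoch, total_epochs, start)
    keep = max(1, math.ceil(fraction * len(order)))
    visible = order[:keep]
    # Shuffle within the visible pool so batches are not strictly sorted.
    random.Random(seed + epoch).shuffle(visible)
    for begin in range(0, len(visible), batch_size):
        yield [samples[i] for i in visible[begin:begin + batch_size]]
```

In contrast to a threshold filter, which would permanently drop every sample whose score exceeds a cutoff, this schedule only delays the noisier samples: with `total_epochs=3`, the last epoch iterates over the full dataset.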