Harmful Visual Content Manipulation Matters in Misinformation Detection Under Multimedia Scenarios

Misinformation now spreads widely across social media platforms, with severe negative effects on society. To address this challenge, the automatic detection of misinformation, particularly in multimedia scenarios, has gained significant attention from both academia and industry, giving rise to the research task known as Multimodal Misinformation Detection (MMD). Current MMD approaches typically focus on capturing semantic relationships and inconsistencies between modalities, but often overlook other critical indicators within multimodal content. Recent research has shown that traces of manipulation in the visual content of social media articles serve as valuable clues for MMD. We further argue that the intention behind a manipulation, e.g., harmful or harmless, also matters in MMD. In this study, we therefore aim to identify multimodal misinformation by capturing two types of features: manipulation features, which indicate whether visual content has been manipulated, and intention features, which characterize the nature of a manipulation by distinguishing harmful from harmless intent. Unfortunately, the manipulation and intention labels needed to supervise these features and make them discriminative are unknown. To address this, we introduce two weakly supervised indicators as substitutes, obtained by incorporating supplementary datasets for image manipulation detection and by framing the two classification tasks as positive and unlabeled (PU) learning problems. Building on this framework, we propose a new MMD approach named Harmful Visual Content Manipulation Matters in MMD (HAVC-M4D). Comprehensive experiments on four prevalent MMD datasets indicate that HAVC-M4D significantly and consistently improves the performance of existing MMD methods.
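To make the positive and unlabeled learning idea concrete, the sketch below computes a PU risk for a binary classifier from scores on labeled-positive samples (for instance, images known to be manipulated) and on unlabeled samples. The abstract does not state which PU estimator is used, so this example assumes the widely used non-negative PU risk with a sigmoid loss, and it assumes the positive class prior is known. The function names `sigmoid_loss` and `nn_pu_risk` and the parameter `prior` are hypothetical and chosen for illustration only.

```python
import numpy as np


def sigmoid_loss(scores, label):
    """Sigmoid loss 1 / (1 + exp(label * score)) for label in {+1, -1}."""
    return 1.0 / (1.0 + np.exp(label * scores))


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk (illustrative assumption, not the paper's stated loss).

    pos_scores: classifier scores on labeled-positive samples
    unl_scores: classifier scores on unlabeled samples
    prior:      assumed fraction of positives in the unlabeled data
    """
    pos_scores = np.asarray(pos_scores, dtype=float)
    unl_scores = np.asarray(unl_scores, dtype=float)

    # Risk of positives when treated as positive.
    risk_pos_as_pos = sigmoid_loss(pos_scores, +1).mean()
    # Risk of positives when treated as negative (used as a correction term).
    risk_pos_as_neg = sigmoid_loss(pos_scores, -1).mean()
    # Risk of unlabeled samples when treated as negative.
    risk_unl_as_neg = sigmoid_loss(unl_scores, -1).mean()

    # Estimated negative-class risk; clamped at zero so that a flexible
    # model cannot push the empirical risk below zero.
    negative_part = risk_unl_as_neg - prior * risk_pos_as_neg
    return prior * risk_pos_as_pos + max(0.0, negative_part)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    positives = rng.normal(loc=1.0, size=100)   # synthetic scores
    unlabeled = rng.normal(loc=0.0, size=400)   # synthetic scores
    print(nn_pu_risk(positives, unlabeled, prior=0.3))
```

In a training loop, this quantity would be minimized with respect to the classifier parameters; in practice the class prior is usually estimated or tuned rather than known exactly.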
