Misinformation on social media harms individuals and societies, and it is amplified by fast-growing multi-modal content (i.e., text and images), which appears more "credible" than text-only news pieces. Although existing supervised misinformation detection methods achieve acceptable performance in key setups, they may require large amounts of labeled data from various events, and collecting such data is time-consuming and tedious. Conversely, directly training a model on a publicly available dataset may fail to generalize because of domain shift between the training data (a.k.a. source domains) and the data from target domains. Most prior work on domain shift focuses on a single modality (e.g., text) and ignores the scenario in which sufficient unlabeled target-domain data is not readily available at an early stage. This lack of data often arises from the dynamics of propagation (i.e., the number of posts related to fake news grows slowly before it catches public attention). We propose a novel robust domain and cross-modal approach (RDCM) for multi-modal misinformation detection. It reduces domain shift by aligning the joint distribution of the textual and visual modalities through an inter-domain alignment module, and it bridges the semantic gap between the two modalities through a cross-modality alignment module. We also propose a framework that simultaneously considers the application scenarios of domain generalization (in which target-domain data is unavailable) and domain adaptation (in which unlabeled target-domain data is available). Evaluation results on two public multi-modal misinformation detection datasets (Pheme and Twitter) demonstrate the superiority of the proposed model. The official implementation of this paper is available at https://github.com/less-and-less-bugs/RDCM
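To make the two alignment ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the joint distribution is formed or which discrepancy measure and alignment loss RDCM uses, so everything here is an assumption made for illustration: joint features are built by concatenating text and image vectors, inter-domain alignment is measured with a squared maximum mean discrepancy (MMD) under a Gaussian kernel, and cross-modality alignment is an InfoNCE-style contrastive loss. All function names and parameters (`gamma`, `temperature`) are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma=1.0):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def joint_features(text_feats, image_feats):
    """Assumption: represent each post by concatenating its two modalities."""
    return [t + v for t, v in zip(text_feats, image_feats)]


def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples (assumed measure)."""
    return (mean_kernel(xs, xs, gamma) + mean_kernel(ys, ys, gamma)
            - 2.0 * mean_kernel(xs, ys, gamma))


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


def cross_modal_contrastive(text_feats, image_feats, temperature=0.1):
    """InfoNCE-style loss: each text should score highest with its own image."""
    loss = 0.0
    for i, text in enumerate(text_feats):
        logits = [cosine(text, image) / temperature for image in image_feats]
        log_denominator = math.log(sum(math.exp(z) for z in logits))
        loss += log_denominator - logits[i]
    return loss / len(text_feats)


# Toy data: two posts per domain, 2-d text and 2-d image embeddings.
source_text = [[1.0, 0.0], [0.0, 1.0]]
source_image = [[1.0, 0.0], [0.0, 1.0]]
target_text = [[0.5, 0.5], [0.2, 0.8]]
target_image = [[0.6, 0.4], [0.1, 0.9]]

source_joint = joint_features(source_text, source_image)
target_joint = joint_features(target_text, target_image)

domain_gap = mmd_squared(source_joint, target_joint)
modality_gap = cross_modal_contrastive(source_text, source_image)
print("inter-domain discrepancy:", domain_gap)
print("cross-modal loss:", modality_gap)
```

In a domain adaptation setting, a term like `domain_gap` would be computed between source and unlabeled target features. In a domain generalization setting, where no target data is available, the same term could instead be computed between pairs of source domains. How RDCM actually combines and weights these terms is specified in the paper and the linked repository, not here.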