With the development of social media, rumors spread widely across social platforms and cause great harm to society. Besides textual information, many rumors use manipulated images or conceal text within images to deceive people and evade detection, which makes multimodal rumor detection a critical problem. Most multimodal rumor detection methods concentrate on extracting features from source claims and their corresponding images, while ignoring the comments on rumors and their propagation structures. These comments and structures reflect the wisdom of crowds and have been shown to be crucial for debunking rumors. Moreover, these methods usually extract visual features only in a basic manner and seldom consider tampering or textual information in images. Therefore, in this study, we propose a novel Vision and Graph Fused Attention Network (VGA) for rumor detection, which uses the propagation structures among posts to capture crowd opinions and further explores visual tampering features as well as the textual information hidden in images. We conduct extensive experiments on three datasets, demonstrating that VGA can effectively detect multimodal rumors and significantly outperforms state-of-the-art methods.