Focusing on Relevant Responses for Multi-modal Rumor Detection

In the absence of an authoritative statement about a rumor, people may expose the truth behind it through their responses on social media. Most rumor detection methods aggregate the information from all responses and have made great progress. However, because users have different backgrounds, their responses differ in how relevant they are to discovering the suspicious points hidden in a rumor claim. Methods that attend to all responding tweets dilute the effect of the critical ones. Moreover, for a multi-modal rumor claim, a user may focus on several words in the text or on an object in the image, so the different modalities should be considered when selecting the relevant responses and verifying the claim. In this paper, we propose a novel multi-modal rumor detection model, termed Focal Reasoning Model (FoRM), which filters out irrelevant responses and then conducts fine-grained reasoning over the multi-modal claim and the corresponding responses. Concretely, FoRM has two main components: coarse-grained selection and fine-grained reasoning. The coarse-grained selection component leverages the post-level features of the responses to verify the claim and learns a relevance score for each response. Based on the relevance scores, the most relevant responses are retained as the critical ones for further reasoning. In the fine-grained reasoning component, we design a relation attention module that explores the fine-grained relations, i.e., token-to-token and token-to-object relations, between the retained responses and the multi-modal claim in order to find valuable clues. Extensive experiments on two real-world datasets demonstrate that our proposed model outperforms all the baselines.
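
The two-stage flow described above can be illustrated with a minimal sketch. It is not the FoRM implementation: the abstract states that relevance scores are learned, whereas the sketch below substitutes a plain dot product between post-level vectors, and it uses generic scaled dot-product attention in place of the relation attention module. The function names (`select_relevant`, `relation_attention`), the cutoff `k`, and the toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_relevant(claim_vec, response_vecs, k):
    """Coarse-grained selection (assumed scoring): rank responses by a
    dot-product score against the claim and keep the k highest.
    Returns the kept indices in their original order, plus all scores."""
    scores = [dot(claim_vec, r) for r in response_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:k])
    return kept, scores


def relation_attention(response_tokens, claim_units):
    """Fine-grained reasoning (generic attention, assumed): each response
    token attends over the claim units, which stand in for both text tokens
    (token-to-token) and image objects (token-to-object)."""
    d = len(claim_units[0])
    attended = []
    for q in response_tokens:
        weights = softmax([dot(q, u) / math.sqrt(d) for u in claim_units])
        attended.append(
            [sum(w * u[j] for w, u in zip(weights, claim_units)) for j in range(d)]
        )
    return attended


# Toy post-level vectors: one claim and four responses (hypothetical data).
claim = [1.0, 0.0]
responses = [[0.9, 0.1], [-0.5, 0.2], [0.7, 0.3], [0.0, 1.0]]
kept, scores = select_relevant(claim, responses, k=2)

# Only the retained responses would be passed on to fine-grained reasoning.
claim_units = [[1.0, 0.0], [0.0, 1.0]]  # e.g. one text token, one image object
context = relation_attention([responses[i] for i in kept], claim_units)
```

Here the hard top-k cutoff stands in for the idea of retaining only the most relevant responses; how the actual model turns learned scores into a selection is not specified in the abstract.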
