Kyle (1985) proposes two types of rumors: informed rumors, which are based on private information, and uninformed rumors, which are not based on any information (i.e., bluffing). Prior studies also find that people with a credible source of information tend to use a more confident textual tone when spreading rumors. Motivated by these theoretical findings, we propose a double-channel structure to determine the ex-ante veracity of rumors on social media. Our ultimate goal is to classify each rumor as true, false, or unverifiable. We first assign each text to either the certain (informed rumor) or the uncertain (uninformed rumor) category. We then apply a lie detection algorithm to informed rumors and a thread-reply agreement detection algorithm to uninformed rumors. On the SemEval 2019 Task 7 dataset, which requires ex-ante three-way classification (true, false, or unverifiable) of social media rumors, our model yields a macro-F1 score of 0.4027, outperforming all baseline models and the second-place system (Gorrell et al., 2019). Furthermore, we empirically validate that the double-channel structure outperforms single-channel structures that apply either the lie detection or the agreement detection algorithm to all posts.
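The routing logic of the double-channel structure, together with the macro-F1 metric used for evaluation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the function names (`double_channel_classify`, `toy_is_certain`, `toy_lie_detector`, `toy_agreement_detector`, `macro_f1`), the hedge-word list, and the keyword rules are hypothetical stand-ins and are not the certainty, lie detection, or agreement detection models used in this work. The macro-F1 function takes the unweighted mean of per-class F1 scores; the official task scorer may handle edge cases differently.

```python
from typing import Callable, List, Sequence

LABELS = ("true", "false", "unverifiable")

# Hypothetical cue lists; a real system would use trained classifiers.
HEDGES = {"maybe", "possibly", "reportedly", "unconfirmed", "might", "apparently"}
SUPPORT = {"yes", "agree", "correct"}
DENY = {"no", "fake", "wrong"}


def _tokens(text: str) -> List[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,!?;:") for t in text.lower().split()]


def toy_is_certain(text: str) -> bool:
    """Placeholder certainty check: certain if no hedge word appears."""
    return not any(t in HEDGES for t in _tokens(text))


def toy_lie_detector(text: str) -> str:
    """Placeholder lie detector: heavy exclamation is treated as deceptive."""
    return "false" if text.count("!") >= 3 else "true"


def toy_agreement_detector(text: str, replies: Sequence[str]) -> str:
    """Placeholder agreement detector: compare supporting and denying replies."""
    support = sum(any(t in SUPPORT for t in _tokens(r)) for r in replies)
    deny = sum(any(t in DENY for t in _tokens(r)) for r in replies)
    if support > deny:
        return "true"
    if deny > support:
        return "false"
    return "unverifiable"


def double_channel_classify(
    text: str,
    replies: Sequence[str],
    is_certain: Callable[[str], bool] = toy_is_certain,
    lie_detector: Callable[[str], str] = toy_lie_detector,
    agreement_detector: Callable[[str, Sequence[str]], str] = toy_agreement_detector,
) -> str:
    """Route certain (informed) rumors to lie detection and
    uncertain (uninformed) rumors to thread-reply agreement detection."""
    if is_certain(text):
        label = lie_detector(text)
    else:
        label = agreement_detector(text, replies)
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = LABELS) -> float:
    """Unweighted mean of per-class F1; a class with no support and no
    predictions contributes 0 in this sketch."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Passing the three components as callables keeps the routing step separate from the underlying models, so a single-channel variant can be obtained in this sketch by supplying an `is_certain` function that always returns the same value.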