Predicting Viral Rumors and Vulnerable Users for Infodemic Surveillance

In the age of the infodemic, surveillance tools need to monitor rampant rumors that can quickly go viral and to identify vulnerable users who may be more susceptible to spreading such misinformation. This proactive approach allows timely preventive measures that mitigate the negative impact of false information on society. We propose a novel approach that predicts viral rumors and vulnerable users with a unified graph neural network model. We pre-train network-based user embeddings and leverage a cross-attention mechanism between users and posts, together with a community-enhanced vulnerability propagation (CVP) method, to improve user and propagation graph representations. Furthermore, we employ two multi-task training strategies to mitigate negative transfer among tasks in different settings, enhancing the overall performance of our approach. We also construct two datasets with ground-truth annotations of information virality and user vulnerability in rumor and non-rumor events, derived automatically from existing rumor detection datasets. Extensive evaluation of our joint learning model confirms its superiority over strong baselines on all three tasks: rumor detection, virality prediction, and user vulnerability scoring. For instance, on the Weibo dataset, compared with the best baselines, our model improves Accuracy and MacF1 for rumor detection by 3.8% and 3.0%, and reduces mean squared error (MSE) by 23.9% and 16.5% for virality prediction and user vulnerability scoring, respectively. Our findings suggest that the approach effectively captures the correlation between rumor virality and user vulnerability and leverages it to improve prediction performance, providing a valuable tool for infodemic surveillance.
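
To make the joint-learning setup concrete, the following is a minimal sketch of how the three task objectives could be combined into one loss. It assumes a cross-entropy term for rumor detection and MSE terms for virality prediction and user vulnerability scoring, summed with fixed weights. The function names and the `weights` parameter are hypothetical, and the fixed weighting is only a stand-in: the two multi-task training strategies of the proposed model are not specified in this abstract.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for a single event."""
    # Clamp to avoid log(0) when the model assigns zero probability.
    return -math.log(max(probs[label], 1e-12))


def mse(predicted, target):
    """Mean squared error over two equally long lists of scores."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def joint_loss(rumor_probs, rumor_label,
               virality_pred, virality_true,
               vuln_pred, vuln_true,
               weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three task losses for one propagation graph.

    The weights are an illustrative assumption, not values from the paper.
    """
    w_det, w_vir, w_vul = weights
    return (w_det * cross_entropy(rumor_probs, rumor_label)   # rumor detection
            + w_vir * mse([virality_pred], [virality_true])   # event virality
            + w_vul * mse(vuln_pred, vuln_true))              # per-user vulnerability


if __name__ == "__main__":
    loss = joint_loss(
        rumor_probs=[0.2, 0.8], rumor_label=1,
        virality_pred=0.6, virality_true=0.9,
        vuln_pred=[0.1, 0.7, 0.4], vuln_true=[0.0, 1.0, 0.5],
    )
    print(f"joint loss: {loss:.4f}")
```

With a fixed weighted sum like this, a poorly scaled task can dominate the gradient, which is one way negative transfer can arise; adjusting `weights` per task is the simplest lever in this sketch.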
