The spread of fake news on social media poses significant threats to individuals and society. Text-based and graph-based models have been employed for fake news detection by analysing news content and propagation networks, showing promising results in specific scenarios. However, these data-driven models rely heavily on pre-existing in-distribution data for training, which limits their performance when confronted with fake news from emerging or previously unseen domains, known as out-of-distribution (OOD) data. Tackling OOD fake news is a challenging yet critical task. In this paper, we introduce the Causal Subgraph-oriented Domain Adaptive Fake News Detection (CSDA) model, designed to enhance zero-shot fake news detection by extracting causal substructures from propagation graphs using in-distribution data and generalising this approach to OOD data. The model employs a graph neural network-based mask generation process to identify dominant nodes and edges within the propagation graph, and uses these substructures for fake news detection. Additionally, the performance of CSDA is further improved through contrastive learning in few-shot scenarios, where a limited amount of OOD data is available for training. Extensive experiments on public social media datasets demonstrate that CSDA effectively handles OOD fake news detection, achieving an accuracy improvement of 7 to 16 percent over other state-of-the-art models.
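To make the mask generation step more concrete, the following is a minimal sketch of edge masking on a propagation graph. It is not the CSDA implementation: the function `edge_mask`, the `keep_ratio` parameter, and the linear scoring vector `weight` are all hypothetical, and the linear scorer merely stands in for a trained graph neural network. The sketch assigns each edge a soft score and keeps the highest-scoring edges as the candidate substructure.

```python
import numpy as np


def edge_mask(node_feats, edges, weight, keep_ratio=0.5):
    """Score every edge and keep the top fraction as a candidate subgraph.

    node_feats: (num_nodes, dim) array of node features.
    edges:      (num_edges, 2) integer array of (source, target) pairs.
    weight:     (2 * dim,) scoring vector; a stand-in (assumption) for a
                trained GNN-based mask generator.
    keep_ratio: fraction of edges to retain (hypothetical parameter).
    """
    # Concatenate the features of both endpoints: shape (num_edges, 2 * dim).
    pair_feats = np.concatenate(
        [node_feats[edges[:, 0]], node_feats[edges[:, 1]]], axis=1
    )
    # A sigmoid maps raw scores to soft mask values in (0, 1).
    scores = 1.0 / (1.0 + np.exp(-(pair_feats @ weight)))
    # Keep the k highest-scoring edges (at least one), in original order.
    k = max(1, int(round(keep_ratio * len(edges))))
    keep = np.sort(np.argsort(-scores)[:k])
    return scores, edges[keep]


# Toy propagation graph: 4 nodes with 2-dimensional features, 4 edges.
node_feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
edges = np.array([[0, 1], [0, 2], [2, 3], [1, 3]])
weight = np.array([1.0, 0.5, 2.0, 0.25])

scores, kept_edges = edge_mask(node_feats, edges, weight, keep_ratio=0.5)
print(kept_edges)
```

In a full model, the retained substructure would be passed to a downstream classifier, and the scorer would be trained jointly with it rather than fixed by hand as it is here.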