The spread of rumors alongside breaking events seriously hinders the spread of truth in the era of social media. Previous studies show that, because annotated resources are scarce, rumors written in minority languages are hard to detect. Furthermore, unforeseen breaking events not covered in yesterday's news exacerbate the scarcity of data resources. In this work, we propose a novel zero-shot framework based on prompt learning to detect rumors that fall in different domains or are presented in different languages. More specifically, we first represent rumors circulated on social media as diverse propagation threads, then design a hierarchical prompt encoding mechanism to learn language-agnostic contextual representations for both prompts and rumor data. To further enhance domain adaptation, we model domain-invariant structural features from the propagation threads to incorporate structural position representations of influential community responses. In addition, a new virtual response augmentation method is used to improve model training. Extensive experiments on three real-world datasets demonstrate that our proposed model achieves much better performance than state-of-the-art methods and exhibits a superior capacity for detecting rumors at early stages.