Rumor modeling and rumor detection for micro-blogging streams have seen considerable progress recently. However, existing automated methods do not perform well at early rumor detection, which is crucial in many settings, e.g., crisis situations. One reason is that aggregated rumor features such as propagation features, which work well in the long run, accumulate over time and are therefore not very helpful in the early phase of a rumor. In this work, we present an approach for early rumor detection that leverages Convolutional Neural Networks to learn hidden representations of individual rumor-related tweets and thereby gain insight into the credibility of each tweet. We then aggregate these predictions from the very beginning of a rumor to obtain the overall event credits (the so-called wisdom), and finally combine them with a time-series-based rumor classification model. Our extensive experiments show clearly improved classification performance within the critical first hours of a rumor. For a better understanding, we also conduct an extensive feature evaluation that emphasizes the early stage and shows that the low-level credibility has the best predictability at all phases of the rumor lifetime.
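The aggregation step described above can be illustrated with a minimal sketch. It assumes that each tweet already carries a credibility score in [0, 1] from some tweet-level classifier, and that event-level credibility is the plain mean of those scores within fixed time intervals since the start of the rumor. The function name `event_credibility_series`, its parameters, and the choice of averaging are illustrative assumptions, not details taken from the abstract.

```python
from typing import List, Optional, Sequence, Tuple


def event_credibility_series(
    tweets: Sequence[Tuple[float, float]],
    interval_hours: float = 1.0,
    n_intervals: int = 4,
) -> List[Optional[float]]:
    """Average per-tweet credibility scores into per-interval event scores.

    tweets: pairs of (hours since the event started, credibility score).
    Returns one value per interval; None marks an interval with no tweets.
    Hypothetical helper: mean aggregation is an assumption of this sketch.
    """
    sums = [0.0] * n_intervals
    counts = [0] * n_intervals
    for hours, score in tweets:
        idx = int(hours // interval_hours)
        # Ignore tweets that fall outside the observed window.
        if 0 <= idx < n_intervals:
            sums[idx] += score
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy scores chosen by hand; they are not results from any experiment.
example = [(0.2, 1.0), (0.7, 0.5), (1.5, 0.25), (3.1, 0.0)]
series = event_credibility_series(example)
```

The resulting per-interval series could then be supplied as one additional feature to a time-series-based classifier, alongside other event-level features.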