News consumption has increased significantly with the growing popularity of web-based forums and social media, which creates fertile ground for misinformation and confusion. To reduce the impact of misinformation on users' health-related decisions and other intents, machine learning models that detect and combat fake news automatically are needed. This paper proposes X-CapsNet, a novel transformer-based model built on Capsule Neural Networks (CapsNet). The model runs a CapsNet with the dynamic routing algorithm in parallel with a size-based classifier to detect short and long fake news statements. We use two size-based classifiers: a Deep Convolutional Neural Network (DCNN) for long news statements and a Multi-Layer Perceptron (MLP) for short news statements. To address the difficulty of representing short news statements, we use indirect features of the news, created by concatenating the vector of the news speaker's profile with a vector of the polarity, sentiment, and word count of the news statement. We evaluate the proposed architecture on the Covid-19 and Liar datasets. The results, in terms of F1-score on the Covid-19 dataset and accuracy on the Liar dataset, show that the proposed models perform better than the state-of-the-art baselines.
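The indirect-feature concatenation and the size-based choice of classifier described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `indirect_features` and `select_branch`, the threshold `LENGTH_THRESHOLD`, and the assumption that polarity and sentiment scores are supplied by an external scorer are all hypothetical, since the abstract does not specify them.

```python
from typing import List, Sequence

# Hypothetical cut-off (in words) between "short" and "long" statements;
# the abstract does not state the value actually used.
LENGTH_THRESHOLD = 30


def word_count(statement: str) -> int:
    """Count whitespace-separated tokens in a statement."""
    return len(statement.split())


def indirect_features(
    statement: str,
    speaker_profile: Sequence[float],
    polarity: float,
    sentiment: float,
) -> List[float]:
    """Concatenate the speaker-profile vector with statement-level scores.

    polarity and sentiment are assumed to come from an external scorer.
    """
    statement_vector = [polarity, sentiment, float(word_count(statement))]
    return list(speaker_profile) + statement_vector


def select_branch(statement: str, threshold: int = LENGTH_THRESHOLD) -> str:
    """Pick the size-based classifier that would run alongside the CapsNet."""
    return "mlp" if word_count(statement) <= threshold else "dcnn"
```

Under these assumptions, a short statement would be routed to the MLP branch together with its concatenated indirect features, while a long statement would be routed to the DCNN branch; in both cases the CapsNet would process the statement in parallel.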