MisRoBÆRTa: Transformers versus Misinformation

Misinformation is considered a threat to our democratic values and principles. The spread of such content on social media polarizes society and undermines public discourse: it distorts public perceptions and generates social unrest while lacking the rigor of traditional journalism. Transformers and transfer learning have proven to be state-of-the-art methods for multiple well-known natural language processing tasks. In this paper, we propose MisRoBÆRTa, a novel transformer-based deep neural ensemble architecture for misinformation detection. MisRoBÆRTa takes advantage of two transformers (BART and RoBERTa) to improve classification performance. We also benchmarked and evaluated the performance of multiple transformers on the task of misinformation detection. For training and testing, we used a large real-world news articles dataset labeled with 10 classes, addressing two shortcomings in current research: we increased the size of the dataset from small to large, and we moved the focus of fake news detection from binary classification to multi-class classification. For this dataset, we manually verified the content of the news articles to ensure that they were correctly labeled. The experimental results show that the accuracy of transformers on the misinformation detection problem was significantly influenced by the method employed to learn the context, the dataset size, and the vocabulary dimension. We observe empirically that, among the classification models that use only one transformer, BART obtains the best accuracy, while DistilRoBERTa obtains the best accuracy in the least amount of time required for fine-tuning and training. The proposed MisRoBÆRTa outperforms the other transformer models in the task of misinformation detection. To arrive at this conclusion, we performed ample ablation and sensitivity testing with MisRoBÆRTa on two datasets.
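To make the two-transformer idea concrete, the following is a minimal sketch of one way an ensemble could combine the outputs of two encoders into a 10-class prediction. It is not the MisRoBÆRTa architecture itself, whose internal layers are not described here. The fusion by concatenation, the single linear head, the toy vector sizes, and the names `fuse_and_classify` and `make_toy_head` are all assumptions made for illustration, and the two input vectors are stand-ins for document embeddings that BART and RoBERTa would produce.

```python
import math
import random

NUM_CLASSES = 10  # the dataset described above is labeled with 10 classes


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_toy_head(in_dim, seed=0):
    """Hypothetical helper: random weights for a single linear layer."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(NUM_CLASSES)]
    bias = [0.0] * NUM_CLASSES
    return weights, bias


def fuse_and_classify(bart_vec, roberta_vec, weights, bias):
    """Assumed fusion: concatenate both embeddings, then apply one linear layer."""
    fused = list(bart_vec) + list(roberta_vec)
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, bias)]
    return softmax(logits)


# Stand-in embeddings; in practice these would come from the two transformers.
bart_vec = [0.2, -0.1, 0.4, 0.0]
roberta_vec = [0.3, 0.1, -0.2, 0.5]

weights, bias = make_toy_head(len(bart_vec) + len(roberta_vec))
probs = fuse_and_classify(bart_vec, roberta_vec, weights, bias)
predicted = max(range(NUM_CLASSES), key=lambda i: probs[i])
```

Here `probs` holds one probability per class and `predicted` is the index of the most likely class. In a trained model the weights would be learned during fine-tuning rather than drawn at random.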
