The coronavirus (COVID-19) pandemic is probably the most disruptive global health disaster in recent history. It affected the whole world and brought the global economy to a virtual standstill. As the virus spread, infecting people and claiming thousands of lives, so did fake news, misinformation, and disinformation about it, including unconfirmed health advice and remedies circulated on social media. In this paper, we identify false information about the pandemic using a content-based approach combined with metadata curated from messages posted to online social networks. After an initial feature analysis, we test several supervised learning models for identifying and predicting misleading posts. Our approach achieves up to 93% accuracy in detecting fake-news posts about the COVID-19 pandemic.