The proliferation of low-quality online information has underscored the need for robust, automatic mechanisms to evaluate the trustworthiness of online news publishers. In this paper, we analyse the trustworthiness of online news media outlets using a dataset of 4033 news stories from 40 different sources. We aim to infer the trustworthiness level of a source from the classification of the content of its individual articles. The trust labels are obtained from NewsGuard, a journalistic organization that evaluates news sources using well-established editorial and publishing criteria. The results indicate that the classification model is highly effective at classifying the trustworthiness levels of news articles. This research has practical applications in alerting readers to potentially untrustworthy news sources, assisting journalistic organizations in evaluating new or unfamiliar media outlets, and supporting the selection of articles for trustworthiness assessment.