The unchecked spread of digital information, combined with increasing political polarization and the tendency of individuals to isolate themselves from opposing political viewpoints, has driven researchers to develop systems for automatically detecting political bias in media. Discussion on social media has further fueled this trend. We explore methods for categorizing bias in US news articles, comparing rule-based and deep learning approaches. The study highlights the sensitivity of modern self-learning systems to unconstrained data ingestion, while reconsidering the strengths of traditional rule-based systems. Applying both models to left-leaning (CNN) and right-leaning (FOX) news articles, we assess their effectiveness on data beyond the original training and test sets. This analysis highlights each model's accuracy, offers a framework for exploring deep-learning explainability, and sheds light on political bias in US news media. We contrast the opaque architecture of a deep learning model with the transparency of a linguistically informed rule-based model, showing that the rule-based model performs consistently across different data conditions and offers greater transparency, whereas the deep learning model depends heavily on its training set and struggles with unseen data.