The manifestation and effects of bias in news reporting have been central topics in the social sciences for decades and have recently received increasing attention in the NLP community. While NLP can help scale up analyses or contribute automatic procedures for investigating the impact of biased news on society, we argue that currently dominant methodologies fall short of addressing the complex questions and effects examined in theoretical media studies. In this survey paper, we review social science approaches and compare them with the typical task formulations, methods, and evaluation metrics used in the analysis of media bias in NLP. We discuss open questions and suggest possible directions for closing the identified gaps between theory and predictive models and their evaluation. These include model transparency, the consideration of document-external information, and cross-document reasoning rather than single-label assignment.