Automated stance detection in complex topics and small languages: the challenging case of immigration in polarizing news media

Automated stance detection and related machine learning methods can provide useful insights for media monitoring and academic research. Many of these approaches require annotated training datasets, which limits their applicability to languages where such datasets may not be readily available. This paper explores the applicability of large language models for automated stance detection in a challenging scenario involving a morphologically complex, lower-resource language and a socio-culturally complex topic, immigration. If the approach works in this case, it can be expected to perform as well or better in less demanding scenarios. We annotate a large set of pro- and anti-immigration examples and compare the performance of multiple language models as supervised learners. We also probe the usability of ChatGPT as an instructable zero-shot classifier for the same task. The supervised classifiers achieve acceptable performance, and ChatGPT yields similar accuracy. This makes zero-shot classification a promising, potentially simpler and cheaper alternative for text classification tasks, including in lower-resource languages. We further use the best-performing model to investigate diachronic trends over seven years in two corpora of Estonian mainstream and right-wing populist news sources, demonstrating the applicability of the approach in news analytics and media monitoring settings, and we discuss correspondences between stance changes and real-world events.

