The rise of populism concerns many political scientists and practitioners, yet the detection of its underlying language remains fragmentary. This paper aims to provide a reliable, valid, and scalable approach to measuring populist stances. To that end, we created an annotated dataset of parliamentary speeches from the German Bundestag (2013 to 2021). Following the ideational definition of populism, we label moralizing references to the virtuous people or the corrupt elite as the core dimensions of populist language. To identify, in addition, how the thin ideology of populism is thickened, we annotate how populist statements are attached to left-wing or right-wing host ideologies. We then train a transformer-based model (PopBERT) as a multilabel classifier to detect and quantify each dimension. A battery of validation checks shows that the model achieves strong predictive accuracy, has high qualitative face validity, matches party rankings from expert surveys, and correctly classifies out-of-sample text snippets. PopBERT enables dynamic analyses of how German-speaking politicians and parties use populist language as a strategic device. Furthermore, the annotator-level data may be used in cross-domain applications or to develop related classifiers.
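To illustrate what a multilabel decision over these dimensions can look like, the following minimal sketch turns one raw score (logit) per dimension into an independent probability and a yes/no label. It is not the PopBERT implementation: the label names, the 0.5 default thresholds, and the function `predict_labels` are assumptions made for illustration only.

```python
import math

# Hypothetical names for the four dimensions described in the abstract:
# two core dimensions and two host ideologies.
LABELS = ["anti_elite", "people_centric", "left_wing", "right_wing"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, thresholds=None):
    """Return {label: (probability, is_assigned)} for one text snippet.

    Each label is decided independently of the others, which is what
    distinguishes multilabel from multiclass classification: a snippet
    can carry several labels at once, or none.
    """
    if thresholds is None:
        # Assumed default; in practice thresholds would be tuned per label.
        thresholds = [0.5] * len(LABELS)
    if not (len(logits) == len(thresholds) == len(LABELS)):
        raise ValueError("expected one logit and one threshold per label")
    probs = [sigmoid(z) for z in logits]
    return {
        label: (p, p >= t)
        for label, p, t in zip(LABELS, probs, thresholds)
    }


if __name__ == "__main__":
    # Made-up logits for a single sentence, not real model output.
    for label, (prob, assigned) in predict_labels([2.0, -1.0, 0.0, -3.0]).items():
        print(f"{label}: p={prob:.2f} assigned={assigned}")
```

Because every dimension gets its own sigmoid and threshold, the share of snippets flagged per dimension can be aggregated by speaker or party, which is one plausible way to quantify each dimension separately.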