Detecting Extreme Ideologies in Shifting Landscapes: an Automatic & Context-Agnostic Approach

In democratic countries, the ideological landscape is foundational to individual and collective political action; conversely, fringe ideology drives Ideologically Motivated Violent Extremism (IMVE). Quantifying ideology is therefore a crucial first step toward a wide range of downstream problems, such as understanding and countering IMVE, detecting and intervening in disinformation campaigns, and broader empirical modeling of opinion dynamics. However, online ideology detection faces two significant hindrances. First, the ground truth that forms the basis for ideology detection is often prohibitively labor-intensive for practitioners to collect, requires access to domain experts, and is specific to the context of its collection (i.e., time, location, and platform). Second, to circumvent this expense, researchers generate ground truth from other ideological signals (such as hashtags used or politicians followed); however, the bias this introduces has not been quantified, and the approach often still requires expert intervention. This work presents an end-to-end ideology detection pipeline applicable to large-scale datasets. We construct context-agnostic and automatic ideological signals from widely available media slant data; show that the resulting pipeline performs well compared with pipelines built on common ideological signals and with state-of-the-art baselines; apply the pipeline to left-right ideology detection and to the more concerning task of detecting extreme ideologies; generate psychosocial profiles of the inferred ideological groups; and derive insights into their morality and preoccupations.
