Using psychological theory to ground guidelines for the annotation of misogynistic language

Detecting misogynistic hate speech is a difficult algorithmic task. It becomes harder still when the decision criteria for what constitutes misogynistic speech are not grounded in the established literatures in psychology and philosophy, both of which have described in great detail the forms that explicit and subtle misogynistic attitudes can take. In particular, the literature on algorithmic detection of misogynistic speech often relies on guidelines that are insufficiently robust or inappropriately justified; these guidelines often omit various misogynistic phenomena or misrepresent their importance when they do include them. As a result, current misogyny detection coding schemes and datasets fail to capture the ways women experience misogyny online. This is of pressing importance: misogyny is on the rise both online and offline. The scientific community therefore needs a systematic, theory-informed coding scheme for misogyny detection and a corresponding dataset with which to train and test misogyny detection models. To this end, we (1) developed a misogyny annotation guideline scheme informed by theoretical and empirical psychological research, (2) annotated a new dataset, achieving substantial inter-rater agreement (kappa = 0.68), and (3) present a case study using Large Language Models (LLMs) to compare our coding scheme to a self-described "expert" misogyny annotation scheme in the literature. Our findings indicate that our guideline scheme outperforms the other coding scheme in the classification of misogynistic texts across three datasets. Additionally, we find that LLMs struggle to replicate our human annotators' labels, which is attributable in large part to how LLMs reflect mainstream views of misogyny. We discuss implications for the use of LLMs in misogyny detection.
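For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how a kappa value can be computed for two annotators. It assumes Cohen's kappa and binary labels; the abstract does not state which kappa variant or how many annotators were used, and the function name `cohens_kappa` and the toy labels are hypothetical and not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: two annotators and nominal labels. The paper may have
    used a different variant (for example Fleiss' kappa).
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: share of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label if each
    # annotator labelled independently according to their own label rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels (1 = misogynistic, 0 = not misogynistic).
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually lower than the simple percentage of matching labels. Under the commonly used Landis and Koch convention, values between 0.61 and 0.80 are described as "substantial".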
