The proliferation of online hate speech poses a difficult problem for social media platforms. A particular challenge is the use of coded language by groups that seek both to create a sense of belonging for their users and to evade detection. Coded language evolves quickly, and its use varies over time. This paper proposes a methodology for detecting emerging coded hate-laden terminology and tests it in the context of online antisemitic discourse. The approach considers posts scraped from social media platforms often used by extremist users. The posts are scraped using seed expressions related to previously known discourse of hatred towards Jews. The method begins by identifying the expressions most representative of each post and calculating their frequency across the whole corpus. It filters out grammatically incoherent expressions as well as previously encountered ones, so as to focus on emergent, well-formed terminology. Next, a fine-tuned large language model is used to assess semantic similarity to known antisemitic terminology, and expressions that are too distant from known expressions of hatred are filtered out. Finally, emergent antisemitic expressions containing terms that clearly relate to Jewish topics are removed, so that only coded expressions of hatred are returned.
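The filtering stages described above can be sketched as a small pipeline. This is only an illustrative sketch under stated assumptions, not the paper's implementation: plain word n-grams stand in for the "most representative expressions" step, the `min_count` and `min_similarity` thresholds are hypothetical, and `similarity` and `is_well_formed` are caller-supplied placeholders for the fine-tuned language model and the grammatical-coherence check, neither of which is specified here. All function and parameter names are invented for illustration.

```python
from collections import Counter
from typing import Callable, Iterable


def candidate_expressions(post: str, max_n: int = 2) -> set[str]:
    """Return the word n-grams (n <= max_n) of one post.

    Assumption: n-grams are a crude stand-in for the step that picks
    the expressions most representative of each post.
    """
    tokens = post.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def detect_coded_expressions(
    posts: Iterable[str],
    known_expressions: set[str],
    explicit_terms: set[str],
    similarity: Callable[[str, str], float],
    is_well_formed: Callable[[str], bool],
    min_count: int = 2,           # hypothetical frequency threshold
    min_similarity: float = 0.5,  # hypothetical similarity threshold
) -> list[tuple[str, int, float]]:
    """Return (expression, post count, similarity) for surviving candidates."""
    # Count in how many posts each candidate expression appears.
    counts: Counter[str] = Counter()
    for post in posts:
        counts.update(candidate_expressions(post))

    results = []
    for expr, count in counts.items():
        if count < min_count:
            continue  # too rare in the corpus
        if expr in known_expressions:
            continue  # previously encountered, so not emergent
        if not is_well_formed(expr):
            continue  # grammatically incoherent
        # Closeness to the nearest known expression (0.0 if none are given).
        score = max(
            (similarity(expr, known) for known in known_expressions),
            default=0.0,
        )
        if score < min_similarity:
            continue  # too distant from known expressions
        if any(token in explicit_terms for token in expr.split()):
            continue  # overt rather than coded
        results.append((expr, count, score))

    # Most frequent first; ties broken alphabetically.
    return sorted(results, key=lambda item: (-item[1], item[0]))
```

In practice, `similarity` would wrap whatever embedding or scoring interface the fine-tuned model exposes, and `explicit_terms` would hold the vocabulary that overtly names the targeted group. Passing both checks in as callables keeps the sketch independent of any particular model or parser.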