The dissemination of hateful memes online has adverse effects on social media platforms and the real world. Detecting hateful memes is challenging, in part because memes evolve: new hateful memes can emerge by fusing hateful connotations with other cultural ideas or symbols. In this paper, we propose a framework that leverages multimodal contrastive learning models, in particular OpenAI's CLIP, to identify targets of hateful content and systematically investigate the evolution of hateful memes. We find that CLIP-generated embeddings exhibit semantic regularities that describe semantic relationships within the same modality (images) or across modalities (images and text). Leveraging this property, we study how hateful memes are created by combining visual elements from multiple images or by fusing textual information with a hateful image. We demonstrate the capabilities of our framework for analyzing the evolution of hateful memes by focusing on antisemitic memes, particularly the Happy Merchant meme. Applying our framework to a dataset extracted from 4chan, we find 3.3K variants of the Happy Merchant meme, some of which are linked to specific countries, persons, or organizations. We envision that our framework can aid human moderators by flagging new variants of hateful memes so that moderators can manually verify them and mitigate the problem of hateful content online.
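The idea of exploiting semantic regularities in the embedding space can be illustrated with a minimal sketch. It assumes that image and text embeddings have already been computed by a CLIP-style encoder and are available as plain vectors. The function names (`compose_query`, `rank_candidates`), the additive composition with a `weight` parameter, and the toy 4-dimensional vectors are hypothetical choices made for illustration, not necessarily the procedure used in the paper.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    x = np.asarray(x, dtype=np.float64)
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def compose_query(base_embedding, added_embedding, weight=1.0):
    """Combine two embeddings by vector addition (an assumed composition rule).

    base_embedding:  embedding of a known meme image.
    added_embedding: embedding of another image or of a text phrase.
    """
    query = l2_normalize(base_embedding) + weight * l2_normalize(added_embedding)
    return l2_normalize(query)


def rank_candidates(query, candidate_embeddings, top_k=3):
    """Return (index, cosine similarity) pairs for the top_k closest candidates."""
    candidates = l2_normalize(candidate_embeddings)
    similarities = candidates @ query
    order = np.argsort(-similarities)[:top_k]
    return [(int(i), float(similarities[i])) for i in order]


# Toy stand-ins for encoder outputs (real CLIP embeddings have hundreds of dimensions).
base = np.array([1.0, 0.0, 0.0, 0.0])      # known meme image
concept = np.array([0.0, 1.0, 0.0, 0.0])   # second image or text phrase
candidates = np.array([
    [1.0, 0.0, 0.0, 0.0],  # 0: contains only the base element
    [0.0, 1.0, 0.0, 0.0],  # 1: contains only the added concept
    [1.0, 1.0, 0.0, 0.0],  # 2: combines both elements
    [0.0, 0.0, 1.0, 0.0],  # 3: unrelated image
])

query = compose_query(base, concept)
ranking = rank_candidates(query, candidates, top_k=4)
print(ranking)
```

In this toy setup the candidate that combines both elements ranks first and the unrelated candidate ranks last. In practice, the highest-ranked images would be candidates for a human moderator to review rather than automatic decisions.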