Recent two-stream approaches have achieved outstanding performance in hateful meme detection. However, hateful memes evolve constantly as new memes emerge that fuse progressive cultural ideas, making existing methods obsolete or ineffective. In this work, we explore the potential of Large Multimodal Models (LMMs) for hateful meme detection. To this end, we propose Evolver, which incorporates LMMs via Chain-of-Evolution (CoE) Prompting by integrating the evolution attribute and in-context information of memes. Specifically, Evolver simulates how memes evolve and express meaning, and reasons through LMMs step by step. First, an evolutionary pair mining module retrieves, from an external curated meme set, the top-k memes most similar to the input meme. Second, an evolutionary information extractor summarizes the semantic regularities between the paired memes for prompting. Finally, a contextual relevance amplifier enhances the in-context hatefulness information to boost the search for evolutionary processes. Extensive experiments on the public FHM, MAMI, and HarM datasets show that CoE prompting can be incorporated into existing LMMs to improve their performance. More encouragingly, it can serve as an interpretive tool to promote the understanding of the evolution of social memes.
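The retrieval step in the evolutionary pair mining module can be illustrated with a minimal sketch. It assumes that each meme has already been mapped to a fixed-length embedding vector and that similarity is measured by cosine similarity. The abstract specifies neither the encoder nor the similarity measure, so both are assumptions, and the names `cosine_similarity` and `top_k_similar` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    pool: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (index, similarity) for the k pool vectors most similar to the query."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(pool)]
    # Sort by similarity, highest first; ties keep their original pool order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings standing in for a curated meme set (assumed data, not from the paper).
pool = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
query = [1.0, 0.0]
neighbours = top_k_similar(query, pool, k=2)
```

In this toy example the retrieved memes would be paired with the input meme and passed on to the later prompting stages. For a large meme set, the exhaustive scan would normally be replaced by an approximate nearest-neighbour index.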