Internet memes have become pervasive carriers of digital culture on social platforms. However, their heavy reliance on metaphors and sociocultural context also makes them subtle vehicles for harmful content, posing significant challenges for automated content moderation. Existing approaches focus primarily on intra-modal and inter-modal signal analysis, yet understanding implicit toxicity often depends on background knowledge that is not explicitly present in the meme itself. To address this challenge, we propose KID, a Knowledge-Injected Dual-Head Learning framework for knowledge-grounded harmful meme detection. KID adopts a label-constrained distillation paradigm to decompose complex meme understanding into structured reasoning chains that explicitly link visual evidence, background knowledge, and classification labels. These chains guide the learning process by grounding external knowledge in meme-specific contexts. In addition, KID employs a dual-head architecture that jointly optimizes semantic generation and classification objectives, enabling aligned linguistic reasoning while maintaining stable decision boundaries. Extensive experiments on five multilingual datasets spanning English, Chinese, and low-resource Bengali demonstrate that KID achieves state-of-the-art performance on both binary and multi-label harmful meme detection tasks, improving over previous best methods by 2.1%--19.7% across primary evaluation metrics. Ablation studies further confirm the effectiveness of knowledge injection and dual-head joint learning, highlighting their complementary contributions to robust and generalizable meme understanding. The code and data are available at https://github.com/PotatoDog1669/KID.
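The dual-head joint objective described above can be sketched as a weighted sum of a generation loss (over reasoning-chain tokens) and a classification loss (over harmfulness labels). This is a minimal illustrative sketch, not the paper's implementation: the function names, the weighting factor `lam`, and the use of plain NumPy cross-entropy are all assumptions for clarity.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax along the last axis
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, targets):
    # mean negative log-likelihood of the target indices
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(targets)), targets]).mean()

def dual_head_loss(gen_logits, gen_targets, cls_logits, cls_target, lam=1.0):
    """Joint objective: L = L_gen + lam * L_cls (lam is a hypothetical weight).

    gen_logits : (T, V) token logits from the generation head
    gen_targets: (T,)   target token ids of the reasoning chain
    cls_logits : (1, C) label logits from the classification head
    cls_target : (1,)   harmfulness label id
    """
    l_gen = cross_entropy(gen_logits, gen_targets)   # semantic generation head
    l_cls = cross_entropy(cls_logits, cls_target)    # classification head
    return l_gen + lam * l_cls
```

Sharing one backbone while summing the two losses is what lets the linguistic reasoning signal and the decision boundary be optimized together rather than in separate fine-tuning stages.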