Meme-based social abuse detection is challenging because harmful intent often relies on implicit cultural symbolism and subtle cross-modal incongruence. Prior approaches, from fusion-based methods to in-context learning with Large Vision-Language Models (LVLMs), have made progress but remain limited by three factors: i) cultural blindness (missing symbolic context), ii) boundary ambiguity (confusing satire with abuse), and iii) lack of interpretability (opaque model reasoning). We introduce CROSS-ALIGN+, a three-stage framework that systematically addresses these limitations: Stage I mitigates cultural blindness by enriching multimodal representations with structured knowledge from ConceptNet, Wikidata, and Hatebase; Stage II reduces boundary ambiguity by fine-tuning parameter-efficient LoRA adapters that sharpen decision boundaries; and Stage III enhances interpretability by generating cascaded explanations. Extensive experiments on five benchmarks and eight LVLMs demonstrate that CROSS-ALIGN+ consistently outperforms state-of-the-art methods, achieving up to a 17% relative F1 improvement while providing an interpretable justification for each decision.
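To make the Stage II mechanism concrete, the sketch below shows how LoRA adapters are typically attached to a frozen backbone with the Hugging Face peft library. This is a minimal illustration under assumptions, not the paper's implementation: the abstract does not specify the base LVLM, adapter rank, or target modules, so the checkpoint (gpt2, a small runnable stand-in) and all hyperparameters here are placeholders.

```python
# Minimal sketch of LoRA-style parameter-efficient adaptation (Stage II).
# All choices below (backbone, rank, alpha, target modules) are assumptions
# for illustration; the paper's actual configuration is not given in the abstract.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Small stand-in backbone so the snippet runs anywhere; not the paper's LVLM.
base = AutoModelForCausalLM.from_pretrained("gpt2")

lora_cfg = LoraConfig(
    r=16,                       # low-rank dimension (assumed)
    lora_alpha=32,              # LoRA scaling factor (assumed)
    target_modules=["c_attn"],  # GPT-2's fused attention projection; LLaMA-style
                                # backbones would typically target q_proj/v_proj
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the injected low-rank adapter weights
                                    # receive gradients; the backbone stays frozen
```

Because only the low-rank adapter matrices are trained, this kind of setup can specialize a large frozen model toward a fine-grained decision boundary (here, satire versus abuse) at a small fraction of the cost of full fine-tuning.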