As a multimodal medium combining images and text, memes frequently convey implicit harmful content through metaphor and humor, making the detection of harmful memes a complex and challenging task. Although recent studies have improved detection accuracy and interpretability, large-scale, high-quality harmful-meme datasets remain scarce, and current methods still struggle to capture implicit risks and nuanced semantics. To address this, we construct MemeMind, a large-scale harmful meme dataset. Aligned with international standards and grounded in real-world internet contexts, MemeMind provides detailed Chain-of-Thought (CoT) reasoning annotations to support fine-grained analysis of the implicit intentions behind memes. Building on this dataset, we propose MemeGuard, a reasoning-oriented multimodal detection model that significantly improves both the accuracy of harmful meme detection and the interpretability of model decisions. Extensive experiments demonstrate that MemeGuard outperforms existing state-of-the-art methods on the MemeMind dataset, establishing a solid foundation for future research on harmful meme detection.