Misinformation is a prevalent societal problem because of the serious harm it can cause. Out-of-context (OOC) misinformation, in which authentic images are repurposed with false text, is one of the easiest and most effective ways to mislead audiences. Current methods focus on assessing image-text consistency but do not provide convincing explanations for their judgments, which are essential for debunking misinformation. While Multimodal Large Language Models (MLLMs) have rich knowledge and an innate capability for visual reasoning and explanation generation, they still lack sophistication in understanding and discovering subtle cross-modal differences. In this paper, we introduce SNIFFER, a novel multimodal large language model specifically engineered for OOC misinformation detection and explanation. SNIFFER employs two-stage instruction tuning on InstructBLIP. The first stage refines the model's concept alignment of generic objects with news-domain entities, and the second stage uses OOC-specific instruction data generated by language-only GPT-4 to fine-tune the model's discriminatory powers. Enhanced by external tools and retrieval, SNIFFER not only detects inconsistencies between text and image but also uses external knowledge for contextual verification. Our experiments show that SNIFFER surpasses the original MLLM by over 40% and outperforms state-of-the-art methods in detection accuracy. SNIFFER also provides accurate and persuasive explanations, as validated by quantitative and human evaluations.