Misinformation spreads across web platforms through billions of daily multimodal posts that combine text and images, overwhelming manual fact-checking capacity. Supervised detection models require domain-specific training data and fail to generalize across diverse manipulation tactics. We present MIRAGE, an inference-time, model-pluggable agentic framework that decomposes multimodal verification into four sequential modules: visual veracity assessment detects AI-generated images, cross-modal consistency analysis identifies out-of-context repurposing, retrieval-augmented factual checking grounds claims in web evidence through iterative question generation, and a calibrated judgment module integrates all signals. MIRAGE orchestrates vision-language model reasoning with targeted web retrieval and outputs structured, citation-linked rationales. On the MMFakeBench validation set (1,000 samples), MIRAGE with GPT-4o-mini achieves 81.65% F1 and 75.1% accuracy, outperforming the strongest zero-shot baseline (GPT-4V with MMD-Agent at 74.0% F1) by 7.65 points while maintaining a 34.3% false positive rate versus 97.3% for a judge-only baseline. Test-set results (5,000 samples) confirm generalization with 81.44% F1 and 75.08% accuracy. Ablation studies show that visual verification contributes 5.18 F1 points and retrieval-augmented reasoning contributes 2.97 points. Our results demonstrate that decomposed agentic reasoning with web retrieval can match supervised detector performance without domain-specific training, enabling misinformation detection across modalities where labeled data remains scarce.
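The four-stage decomposition described above can be sketched as a sequential pipeline. This is a minimal illustrative skeleton, not the authors' implementation: all function names, signatures, stub heuristics, and the `max_rounds` parameter are assumptions, and the model calls and web retrieval are replaced with placeholders.

```python
# Hypothetical sketch of a MIRAGE-style four-stage pipeline.
# Every module below is a stub standing in for a VLM call or web retrieval.

def visual_veracity(image):
    """Stage 1: flag likely AI-generated imagery (stub score)."""
    return {"ai_generated": False, "score": 0.1}

def cross_modal_consistency(image, text):
    """Stage 2: check for out-of-context repurposing of the image (stub score)."""
    return {"out_of_context": False, "score": 0.2}

def retrieval_augmented_check(text, max_rounds=3):
    """Stage 3: iteratively generate questions and ground claims in web evidence.

    A real system would have a VLM generate each query and a retriever fetch
    web documents; here the loop is stubbed with placeholder evidence items.
    """
    evidence = []
    for _ in range(max_rounds):
        evidence.append({"claim_supported": True, "citation": "stub-url"})
    return evidence

def calibrated_judgment(visual, consistency, evidence, threshold=0.5):
    """Stage 4: integrate all signals into a verdict with a linked rationale."""
    fake_score = max(visual["score"], consistency["score"])
    if not all(item["claim_supported"] for item in evidence):
        fake_score = max(fake_score, 0.9)  # unsupported claim dominates
    label = "fake" if fake_score >= threshold else "real"
    return {"label": label, "score": fake_score, "rationale": evidence}

def mirage(image, text):
    """Run the four modules in sequence on one multimodal post."""
    v = visual_veracity(image)
    c = cross_modal_consistency(image, text)
    e = retrieval_augmented_check(text)
    return calibrated_judgment(v, c, e)
```

The sequential structure means each stage can be swapped for a different backbone model (the "model-pluggable" property), since stages communicate only through structured dictionaries rather than shared model state.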