While multimodal large reasoning models (MLRMs) have exhibited impressive capabilities, they remain prone to hallucinations, and effective mitigation strategies are still underexplored. In this paper, we experimentally analyze the causes of hallucination and propose C3PO, a training-based mitigation framework comprising \textbf{C}hain-of-Thought \textbf{C}ompression and \textbf{C}ontrastive \textbf{P}reference \textbf{O}ptimization. First, we identify that introducing reasoning mechanisms exacerbates models' reliance on language priors at the expense of visual inputs, yielding chains of thought (CoTs) with diminished visual cues yet redundant text tokens. To this end, we selectively filter redundant thinking tokens to obtain a more compact, signal-efficient CoT representation that preserves task-relevant information while suppressing noise. In addition, we observe that the quality of the reasoning trace largely determines whether hallucination emerges in subsequent responses. Leveraging this insight, we introduce a reasoning-enhanced preference tuning scheme that constructs training pairs from high-quality AI feedback. We further design a multimodal hallucination-inducing mechanism that elicits a model's inherent hallucination patterns via carefully crafted inducers, yielding informative negative signals for contrastive correction. We provide theoretical justification for the framework's effectiveness and demonstrate consistent hallucination reduction across diverse MLRMs and benchmarks.
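The abstract does not specify the filtering criterion used for CoT compression; as a rough illustration only, the sketch below (all names hypothetical) keeps the most salient fraction of thinking tokens, assuming a per-token saliency score such as attention mass on visual tokens is available:

\begin{verbatim}
import torch

def compress_cot(token_ids: torch.Tensor,
                 saliency: torch.Tensor,
                 keep_ratio: float = 0.5) -> torch.Tensor:
    """Keep the top fraction of thinking tokens by saliency.

    token_ids: (T,) token ids of the chain-of-thought segment.
    saliency:  (T,) per-token importance scores (assumed here to
               be, e.g., attention mass on visual tokens).
    """
    k = max(1, int(keep_ratio * token_ids.numel()))
    # Pick the k most salient tokens, then restore left-to-right
    # order so the compressed CoT stays a coherent subsequence.
    keep = torch.sort(torch.topk(saliency, k).indices).values
    return token_ids[keep]
\end{verbatim}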
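Likewise, the contrastive preference objective is not spelled out at this level of detail; a minimal sketch, assuming a standard DPO-style formulation in which the preferred response is hallucination-free and the dispreferred one is produced by the hallucination-inducing mechanism, might look like:

\begin{verbatim}
import torch
import torch.nn.functional as F

def contrastive_preference_loss(logp_chosen, logp_rejected,
                                ref_logp_chosen, ref_logp_rejected,
                                beta: float = 0.1) -> torch.Tensor:
    """DPO-style loss over (B,) summed sequence log-probs.

    `chosen` is the preferred (hallucination-free) response,
    `rejected` the induced-hallucination one; `ref_*` values are
    computed under a frozen reference model.
    """
    margin = (logp_chosen - ref_logp_chosen) \
             - (logp_rejected - ref_logp_rejected)
    return -F.logsigmoid(beta * margin).mean()
\end{verbatim}

In this formulation \texttt{beta} controls how far the tuned policy may drift from the frozen reference model; larger values sharpen the preference margin at the risk of over-correction.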