OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

Hallucination, a pervasive challenge for multi-modal large language models (MLLMs), has significantly impeded their use in real-world applications that demand precise judgment. Existing methods mitigate this issue either by training on specifically designed data or by drawing on external knowledge from other sources at inference time, both of which incur additional costs. In this paper, we present OPERA, a novel MLLM decoding method grounded in an Over-trust Penalty and a Retrospection-Allocation strategy, serving as a nearly free lunch that alleviates hallucination without additional data, knowledge, or training. Our approach begins with an interesting observation: most hallucinations are closely tied to the knowledge aggregation patterns manifested in the self-attention matrix, i.e., MLLMs tend to generate new tokens by focusing on a few summary tokens rather than on all previous tokens. This partial over-trust tendency leads the model to neglect image tokens and to describe the image content with hallucinations. Statistically, we observe an 80%$\sim$95% co-occurrence rate between hallucinated content and such knowledge aggregation patterns. Based on this observation, OPERA introduces a penalty term on the model logits during beam-search decoding to mitigate the over-trust issue, along with a rollback strategy that retrospects the presence of summary tokens among the previously generated tokens and re-allocates the token selection if necessary. In extensive experiments, OPERA shows significant hallucination-mitigating performance across different MLLMs and metrics, demonstrating its effectiveness and generality. Our code is available at: https://github.com/shikiw/OPERA.
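The decoding idea described above can be illustrated with a small sketch. It is not the released OPERA implementation: the scoring rule (the mean attention a column receives from later rows), the function names `aggregation_score`, `penalized_choice`, and `rollback_target`, and the parameters `alpha` and `min_repeats` are all assumptions chosen for illustration. The abstract only states that a penalty is applied to the logits during beam search and that decoding rolls back when summary tokens are detected.

```python
from collections import Counter


def aggregation_score(window):
    """Return (score, column) for a square, causal attention window.

    window[i][j] is the attention token i pays to an earlier token j (j <= i).
    ASSUMPTION: a column's score is the mean attention it receives from later
    rows; a high score suggests that token is acting as a "summary" token.
    """
    k = len(window)
    best_score, best_col = 0.0, -1
    for j in range(k - 2):  # require at least two later rows per column
        later = [window[i][j] for i in range(j + 1, k)]
        score = sum(later) / len(later)
        if score > best_score:
            best_score, best_col = score, j
    return best_score, best_col


def penalized_choice(candidates, alpha=1.0):
    """Pick the candidate token with the best penalized log-probability.

    candidates: list of (token, log_prob, window) tuples, one per beam option.
    ASSUMPTION: the penalty is alpha * aggregation score, subtracted directly.
    """
    def adjusted(candidate):
        _, log_prob, window = candidate
        score, _ = aggregation_score(window)
        return log_prob - alpha * score

    return max(candidates, key=adjusted)[0]


def rollback_target(summary_columns, min_repeats=3):
    """Return the column that dominated at least min_repeats recent steps.

    Returns None when no column repeats often enough, i.e. no rollback.
    """
    if not summary_columns:
        return None
    col, count = Counter(summary_columns).most_common(1)[0]
    return col if count >= min_repeats and col >= 0 else None


# Toy 4x4 attention windows: one spread out, one fixated on column 1.
diffuse = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.3, 0.3, 0.4, 0.0],
    [0.25, 0.25, 0.25, 0.25],
]
peaked = [
    [1.0, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.0, 0.0],
    [0.05, 0.9, 0.05, 0.0],
    [0.05, 0.9, 0.03, 0.02],
]
candidates = [("dog", -0.5, peaked), ("grass", -0.9, diffuse)]
```

With `alpha=0.0` the higher-probability candidate `"dog"` is selected. With `alpha=1.0` its fixation on a single column costs it more than its probability advantage, and `"grass"` is selected instead. Calling `rollback_target([1, 1, 0, 1])` returns `1`, the position a decoder following this sketch would roll back to before re-selecting a token.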
