Vocabulary Hijacking in LVLMs: Unveiling Critical Attention Heads by Excluding Inert Tokens to Mitigate Hallucination

Large Vision-Language Models (LVLMs) have achieved remarkable progress in multimodal tasks, yet their reliability is persistently undermined by hallucinations, that is, generated text that contradicts the visual input. Recent studies often attribute these errors to inadequate visual attention. In this work, we analyze the attention mechanisms via the logit lens, uncovering a distinct anomaly we term Vocabulary Hijacking. We discover that specific visual tokens, defined as Inert Tokens, attract a disproportionate share of attention. Crucially, when their intermediate hidden states are projected into the vocabulary space, they consistently decode to a fixed set of unrelated words (termed Hijacking Anchors) across layers, revealing a rigid semantic collapse. Leveraging this semantic rigidity, we propose Hijacking Anchor-Based Identification (HABI), a robust strategy for accurately localizing these Inert Tokens. To quantify the impact of this phenomenon, we introduce the Non-Hijacked Visual Attention Ratio (NHAR), a novel metric designed to identify attention heads that remain resilient to hijacking and are critical for factual accuracy. Building on these insights, we propose Hijacking-Aware Visual Attention Enhancement (HAVAE), a training-free intervention that selectively strengthens the focus of these identified heads on salient visual content. Extensive experiments across multiple benchmarks demonstrate that HAVAE significantly mitigates hallucinations with no additional computational overhead, while preserving the model's general capabilities. Our code is publicly available at https://github.com/lab-klc/HAVAE.
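
The pipeline described above (logit-lens projection, anchor-based identification of Inert Tokens, a per-head non-hijacked attention ratio, and head-selective enhancement) can be illustrated with a small NumPy sketch. This is not the authors' released implementation, and the abstract does not give formal definitions. The function names, the layer-fraction threshold `min_layer_frac`, the ratio used for the NHAR-style score (attention on non-inert visual tokens divided by attention on all visual tokens), and the multiplicative boost `alpha` are all assumptions made purely for illustration.

```python
import numpy as np

def logit_lens_top1(hidden, unembed):
    """Project hidden states into vocabulary space and take the top-1 token.

    hidden:  (layers, tokens, d) intermediate hidden states
    unembed: (d, vocab) unembedding matrix
    returns: (layers, tokens) array of top-1 vocabulary ids
    """
    logits = hidden @ unembed
    return logits.argmax(axis=-1)

def find_inert_tokens(top1_ids, anchor_ids, min_layer_frac=0.8):
    """Flag tokens that decode to an anchor word in most layers.

    min_layer_frac is a hypothetical threshold, not taken from the paper.
    """
    is_anchor = np.isin(top1_ids, list(anchor_ids))  # (layers, tokens)
    return is_anchor.mean(axis=0) >= min_layer_frac

def non_hijacked_ratio(attn, visual_mask, inert_mask):
    """Assumed NHAR-style score for one query position.

    attn: (heads, seq) attention weights
    returns: per-head share of visual attention on non-inert visual tokens
    """
    visual = attn[:, visual_mask].sum(axis=1)
    clean = attn[:, visual_mask & ~inert_mask].sum(axis=1)
    return clean / np.maximum(visual, 1e-12)

def enhance_heads(attn, visual_mask, inert_mask, heads, alpha=0.5):
    """Boost non-inert visual attention in the selected heads, then renormalize.

    The (1 + alpha) scaling rule is an assumption for illustration only.
    """
    out = attn.copy()
    target = visual_mask & ~inert_mask
    for h in heads:
        out[h, target] *= 1.0 + alpha
        out[h] /= out[h].sum()
    return out

# Toy example: 2 layers, 4 tokens (3 visual + 1 text), identity unembedding.
unembed = np.eye(3)
hidden = np.array([
    [[0, 0, 1], [1, 0, 0], [0, 1, 0], [1, 0, 0]],
    [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]],
], dtype=float)
visual_mask = np.array([True, True, True, False])
anchor_ids = {2}  # hypothetical Hijacking Anchor vocabulary id

top1 = logit_lens_top1(hidden, unembed)
inert_mask = find_inert_tokens(top1, anchor_ids) & visual_mask

attn = np.array([
    [0.6, 0.1, 0.1, 0.2],  # head 0: mostly attends to the inert token
    [0.1, 0.4, 0.3, 0.2],  # head 1: mostly attends to other visual tokens
])
scores = non_hijacked_ratio(attn, visual_mask, inert_mask)
selected = np.flatnonzero(scores > 0.5)  # hypothetical selection rule
boosted = enhance_heads(attn, visual_mask, inert_mask, selected)
print(inert_mask, scores, selected, boosted, sep="\n")
```

In this toy setup only the first visual token decodes to the anchor id in every layer, so it alone is flagged as inert. The head that spends most of its visual attention elsewhere receives the higher score and is the only one modified. A real application would read hidden states and attention maps from an LVLM and use the selection criteria defined in the paper.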
