Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge. Recent studies have identified specific attention heads within the Transformer architecture, known as retrieval heads, that are responsible for extracting relevant contextual information. We hypothesise that masking these retrieval heads can induce hallucinations, and that contrasting the outputs of the base LLM and the masked LLM can reduce them. To this end, we propose Decoding by Contrasting Retrieval Heads (DeCoRe), a novel training-free decoding strategy that amplifies information found in the context and model parameters. DeCoRe mitigates potentially hallucinated responses by dynamically contrasting the outputs of the base LLM and the masked LLM, using conditional entropy as a guide. Our extensive experiments confirm that DeCoRe significantly improves performance on tasks requiring high contextual faithfulness, such as summarisation (XSum by 18.6%), instruction following (MemoTrap by 10.9%), and open-book question answering (NQ-Open by 2.4% and NQ-Swap by 5.5%).
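The contrastive decoding step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes as given that masking the retrieval heads has already produced a second set of next-token logits, and the entropy-proportional contrast weight (`alpha`) is a hypothetical schedule standing in for the paper's conditional-entropy guidance.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D logit vector."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def decore_next_token(base_logits: np.ndarray, masked_logits: np.ndarray) -> int:
    """Sketch of a DeCoRe-style contrastive decoding step (assumed form).

    The base model's next-token distribution is contrasted against that of
    the retrieval-head-masked model; the contrast strength is scaled by the
    entropy of the base distribution, so the correction is stronger when the
    base model is more uncertain (a hypothetical weighting choice).
    """
    p_base = softmax(base_logits)
    # Entropy of the base model's next-token distribution (in nats).
    entropy = -np.sum(p_base * np.log(p_base + 1e-12))
    alpha = entropy  # assumed schedule: more uncertainty -> stronger contrast
    # Amplify what the base model knows and subtract the masked model's
    # (potentially hallucination-prone) preferences.
    contrasted = (1 + alpha) * base_logits - alpha * masked_logits
    return int(np.argmax(contrasted))

# Toy example: the base model prefers token 0, while the masked model
# (with retrieval heads ablated) drifts towards token 1; the contrast
# keeps the base model's contextually grounded choice.
choice = decore_next_token(np.array([2.0, 1.0, 0.0]), np.array([0.0, 3.0, 0.0]))
```

In practice the contrasted scores would feed into the usual sampling or greedy decoding loop rather than a single `argmax`.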