Large Language Models (LLMs) often generate hallucinations, producing outputs that are contextually inaccurate or factually incorrect. We introduce HICD, a novel method that induces hallucinations for contrastive decoding in order to mitigate them. Unlike existing contrastive decoding methods, HICD selects the attention heads most critical to the model's prediction as inducing heads, induces hallucinations by dispersing the attention of these heads, and contrasts the hallucinated outputs with the original outputs to obtain the final result. Our approach significantly improves performance on tasks requiring contextual faithfulness, such as context completion, reading comprehension, and question answering, and also improves factuality on tasks requiring accurate knowledge recall. We demonstrate that our inducing-head selection and attention-dispersion method yields more "contrast-effective" hallucinations for contrastive decoding, outperforming other hallucination-inducing methods. Our findings provide a promising strategy for reducing hallucinations by inducing them in a controlled manner, enhancing the performance of LLMs across a wide range of tasks.
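To make the two-step pipeline concrete, the sketch below illustrates (a) dispersing the attention of selected inducing heads and (b) contrasting the resulting logits with the original ones. This is a minimal sketch under assumptions, not the paper's implementation: the uniform-dispersion scheme, the combination (1 + alpha) * logits_orig - alpha * logits_hall, the alpha parameter, and all function names are hypothetical, following the standard contrastive-decoding formulation; the abstract does not specify HICD's exact equations.

```python
import torch

def disperse_attention(attn: torch.Tensor, inducing_heads: list[int]) -> torch.Tensor:
    """Flatten the attention distribution of the selected inducing heads.

    attn: post-softmax weights of shape (batch, num_heads, q_len, k_len),
    where each row sums to 1. NOTE (assumption): we disperse uniformly over
    all key positions for simplicity; a real implementation would spread
    mass only over positions the causal mask allows.
    """
    dispersed = attn.clone()
    k_len = attn.size(-1)
    # Uniform attention removes the heads' focus, inducing hallucinations.
    dispersed[:, inducing_heads] = 1.0 / k_len
    return dispersed

def contrastive_logits(logits_orig: torch.Tensor,
                       logits_hall: torch.Tensor,
                       alpha: float = 1.0) -> torch.Tensor:
    """Standard contrastive-decoding combination: amplify what the original
    model predicts relative to the hallucination-induced pass."""
    return (1.0 + alpha) * logits_orig - alpha * logits_hall
```

In practice, the hallucinated forward pass would be obtained by hooking the model's attention modules so that only the chosen inducing heads are dispersed, and the next token would be decoded from `contrastive_logits` (possibly under a plausibility constraint, as in prior contrastive-decoding work).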