Augmenting Large Language Models (LLMs) with retrieved external knowledge has proven effective for improving the factual accuracy of generated responses. Despite this success, retrieval-augmented LLMs still face the distractibility issue, in which generated responses are negatively influenced by noise from both external and internal knowledge sources. In this paper, we introduce a novel, training-free, entropy-guided decoding method to mitigate this issue. Our approach uses entropy-based document-parallel ensemble decoding to prioritize low-entropy distributions from retrieved documents, thereby enhancing the extraction of relevant information from the context. It further incorporates a contrastive decoding mechanism that contrasts the resulting low-entropy ensemble distribution with the high-entropy distribution derived from the model's internal knowledge across layers, ensuring greater emphasis on reliable external information. Extensive experiments on open-domain question-answering datasets demonstrate the superiority of our method.
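To make the two decoding steps concrete, below is a minimal PyTorch sketch of how an entropy-weighted ensemble over per-document next-token distributions, followed by a contrastive adjustment against an internal-knowledge distribution, could look. This is an illustrative assumption, not the paper's exact formulation: the function names (`entropy`, `ensemble_contrastive_decode`), the softmax-over-negative-entropy weighting, and the `alpha` strength parameter are all hypothetical, and the paper's method additionally selects the high-entropy internal distribution across layers, which is not modeled here.

```python
import torch

def entropy(p, dim=-1, eps=1e-12):
    # Shannon entropy of a probability distribution along `dim`.
    return -(p * (p + eps).log()).sum(dim=dim)

def ensemble_contrastive_decode(doc_probs, internal_probs, alpha=1.0):
    """Sketch of entropy-guided ensemble + contrastive decoding.

    doc_probs:      (num_docs, vocab) next-token distributions, each
                    conditioned on one retrieved document
                    (document-parallel decoding).
    internal_probs: (vocab,) distribution reflecting the model's
                    internal knowledge (assumed given here, e.g. from
                    a no-context forward pass or an earlier layer).
    Returns a (vocab,) distribution favoring confident external evidence.
    """
    # Step 1: entropy-based ensemble. Lower entropy means a more
    # confident, document-grounded prediction, so it gets more weight.
    ents = entropy(doc_probs)                                   # (num_docs,)
    weights = torch.softmax(-ents, dim=0)                       # low entropy -> high weight
    ensemble = (weights.unsqueeze(-1) * doc_probs).sum(dim=0)   # (vocab,)

    # Step 2: contrastive decoding in log space, boosting tokens the
    # external ensemble supports relative to the internal prior.
    eps = 1e-12
    contrast = (ensemble + eps).log() - alpha * (internal_probs + eps).log()
    return torch.softmax(contrast, dim=-1)

# Toy usage: 3 retrieved documents, vocabulary of 5 tokens.
doc_probs = torch.softmax(torch.randn(3, 5), dim=-1)
internal_probs = torch.softmax(torch.randn(5), dim=-1)
next_token_dist = ensemble_contrastive_decode(doc_probs, internal_probs)
next_token = next_token_dist.argmax().item()
```

The design intuition the sketch captures is that per-document entropy acts as a confidence signal for selecting external evidence, while the log-space subtraction suppresses tokens that are likely only under the model's parametric prior.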