Retrieval-augmented language models (RALMs) have demonstrated significant potential in refining and expanding their internal memory by retrieving evidence from external sources. However, RALMs inevitably encounter knowledge conflicts when integrating their internal memory with external sources. These conflicts can trap RALMs in a tug-of-war between competing knowledge sources, limiting their practical applicability. In this paper, we focus on exploring and resolving knowledge conflicts in RALMs. First, we present an evaluation framework for assessing knowledge conflicts across various dimensions. Then, we investigate the behavior and preferences of RALMs from two perspectives: (1) Conflicts between internal memory and external sources: We find that stronger RALMs exhibit the Dunning-Kruger effect, persistently favoring their faulty internal memory even when correct evidence is provided. In addition, RALMs exhibit an availability bias towards common knowledge; (2) Conflicts between truthful, irrelevant, and misleading evidence: We reveal that RALMs follow the principle of majority rule, tending to trust evidence that appears more frequently. Moreover, we find that RALMs exhibit confirmation bias and are more willing to choose evidence that is consistent with their internal memory. To address the challenge of knowledge conflicts, we propose a method called Conflict-Disentangle Contrastive Decoding (CD2) to better calibrate the model's confidence. Experimental results demonstrate that CD2 can effectively resolve knowledge conflicts in RALMs.