Knowledge-intensive language understanding tasks require Language Models (LMs) to integrate relevant context, mitigating their inherent weaknesses, such as incomplete or outdated knowledge. However, conflicting knowledge can be present in an LM's parameters, termed intra-memory conflict, which can affect the model's propensity to accept contextual knowledge. To study the effect of intra-memory conflict on an LM's ability to accept relevant context, we utilize two knowledge conflict measures and a novel dataset containing inherently conflicting data, DynamicQA. This dataset includes temporally dynamic facts, which can change over time, and disputable dynamic facts, which can change depending on the viewpoint. DynamicQA is the first dataset to include real-world knowledge conflicts and provide context for studying the link between the different types of knowledge conflicts. We also evaluate several measures on their ability to reflect the presence of intra-memory conflict: semantic entropy and a novel coherent persuasion score. With our extensive experiments, we verify that LMs exhibit a greater degree of intra-memory conflict with dynamic facts than with facts that have a single truth value. Furthermore, we reveal that facts with intra-memory conflict are harder to update with context, suggesting that retrieval-augmented generation will struggle with the most commonly adapted facts.