Large language models (LLMs) often encounter knowledge conflicts: scenarios in which a discrepancy arises between the internal parametric knowledge of an LLM and the non-parametric information provided in the prompt context. In this work, we ask what the desiderata for LLMs are when a knowledge conflict arises, and whether existing LLMs fulfill them. We posit that LLMs should 1) identify knowledge conflicts, 2) pinpoint conflicting information segments, and 3) provide distinct answers or viewpoints in conflicting scenarios. To this end, we introduce KNOWLEDGE CONFLICT, an evaluation framework for simulating contextual knowledge conflicts and quantitatively evaluating the extent to which LLMs achieve these goals. KNOWLEDGE CONFLICT covers diverse and complex knowledge conflict situations, knowledge from diverse entities and domains, two synthetic conflict creation methods, and settings of progressively increasing difficulty that reflect realistic knowledge conflicts. Extensive experiments with the KNOWLEDGE CONFLICT framework reveal that while LLMs perform well at identifying the existence of knowledge conflicts, they struggle to determine the specific conflicting knowledge and to produce a response with distinct answers amid conflicting information. To address these challenges, we propose new instruction-based approaches that augment LLMs to better achieve the three goals. Further analysis shows that the ability to handle knowledge conflicts is strongly affected by factors such as knowledge domain and prompt text, while generating robust responses to knowledge conflict scenarios remains an open research question.