Language Models (LMs) acquire parametric knowledge during training, embedding it within their weights. The increasing scale of LMs, however, poses significant challenges both for understanding a model's inner workings and for updating or correcting this embedded knowledge without the substantial cost of retraining. Moreover, when these models are applied to knowledge-intensive language understanding tasks, they must integrate relevant context to mitigate inherent weaknesses such as incomplete or outdated knowledge. Nevertheless, studies indicate that LMs often ignore the provided context when it conflicts with the parametric memory acquired during pre-training, a situation known as context-memory conflict. Conflicting knowledge can also already be present within the LM's parameters themselves, termed intra-memory conflict. This underscores the importance of understanding the interplay between a model's parametric knowledge and retrieved contextual knowledge. In this talk, I aim to shed light on this issue by presenting our research on evaluating the knowledge encoded in LMs, on diagnostic tests that can reveal knowledge conflicts, and on the characteristics of contextual knowledge that is successfully used.