Large language models (LLMs) have had a major impact on society owing to their impressive capabilities and broad knowledge of the world. Many applications and tools now let users interact with these models in a black-box setting. One limitation of this setting is that users cannot modify the model's internal knowledge; the only way to supply new or updated knowledge is to state it explicitly to the model during the current interaction. This process is called in-context learning, and it refers to learning that is confined to the user's current session or context. In-context learning has significant applications, but its limitations are seldom studied. In this paper, we present a study showing that the model can suffer from interference between pieces of information that continually flow into the context, causing it to forget previously learned knowledge and reducing its performance. In addition to demonstrating the problem, we propose an evaluation benchmark based on the bAbI dataset.