Previous work has demonstrated the ability of large language models (LLMs) to retrieve facts and process context knowledge. However, little research exists on how LLMs encode knowledge layer by layer, which limits our understanding of their internal mechanisms. In this paper, we make a first attempt to investigate the layer-wise capability of LLMs through probing tasks. We use the generative capability of ChatGPT to construct probing datasets that provide diverse and coherent evidence for a variety of facts. We adopt $\mathcal V$-usable information as the validation metric because it better reflects how well each layer encodes context knowledge. Our experiments on conflicting and newly acquired knowledge show that LLMs: (1) tend to encode more context knowledge in the upper layers; (2) primarily encode context knowledge within knowledge-related entity tokens at lower layers, while progressively spreading it to other tokens at upper layers; and (3) gradually forget earlier context knowledge retained in the intermediate layers when provided with irrelevant evidence. Code is publicly available at https://github.com/Jometeorie/probing_llama.
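To make the validation metric concrete, the following is a minimal sketch of how $\mathcal V$-usable information is commonly estimated in the general $\mathcal V$-information framework: the predictive entropy of a probe that sees no input, minus the predictive entropy of a probe that sees a layer's representation. It is not taken from the authors' repository. The function names, and the assumption that a probe has already produced a probability for each gold label, are hypothetical and used only for illustration.

```python
import math
from typing import Sequence


def v_entropy(probs_of_gold: Sequence[float]) -> float:
    """Mean negative log2-probability a probe assigns to the gold labels (in bits)."""
    return -sum(math.log2(p) for p in probs_of_gold) / len(probs_of_gold)


def v_usable_information(
    p_gold_null: Sequence[float],
    p_gold_given_repr: Sequence[float],
) -> float:
    """Estimate V-usable information as H_V(Y) - H_V(Y | X).

    p_gold_null:       gold-label probabilities from a probe given a null input
                       (hypothetical; approximates H_V(Y)).
    p_gold_given_repr: gold-label probabilities from a probe given one layer's
                       hidden representation (hypothetical; approximates H_V(Y | X)).
    """
    if len(p_gold_null) != len(p_gold_given_repr):
        raise ValueError("both probes must be scored on the same examples")
    return v_entropy(p_gold_null) - v_entropy(p_gold_given_repr)


# Toy numbers, invented for illustration only (not results from the paper).
null_probs = [0.5, 0.5, 0.5, 0.5]    # uninformed probe: 1 bit of entropy
layer_probs = [1.0, 1.0, 0.5, 0.5]   # probe on a layer's representation
info_bits = v_usable_information(null_probs, layer_probs)
```

Under this estimate, a higher value for a given layer would indicate that a probe from the chosen family can extract more of the label from that layer's representation, which is the sense in which layers are compared.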