Recent developments in Language Models (LMs) have shown their effectiveness in NLP tasks, particularly knowledge-intensive ones. However, the mechanisms underlying knowledge storage and memory access within their parameters remain elusive. In this paper, we investigate whether a generative LM (e.g., GPT-2) can access its memory sequentially or randomly. Through carefully designed synthetic tasks covering full recitation, selective recitation, and grounded question answering, we reveal that LMs can access their memory sequentially but struggle to access memorized content randomly. We find that techniques including recitation and permutation improve the random memory access capability of LMs. Furthermore, by applying recitation to realistic open-domain question answering, we validate that enhancing random access by recitation leads to notable improvements in question answering. The code to reproduce our experiments can be found at https://github.com/sail-sg/lm-random-memory-access.