Recent developments in Language Models (LMs) have demonstrated their effectiveness in NLP tasks, particularly knowledge-intensive ones. However, the mechanisms by which knowledge is stored in, and memory is accessed from, their parameters remain elusive. In this paper, we investigate whether a generative LM (e.g., GPT-2) can access its memory sequentially or randomly. Through carefully designed synthetic tasks covering the scenarios of full recitation, selective recitation, and grounded question answering, we reveal that LMs manage to access their memory sequentially, while encountering challenges in randomly accessing memorized content. We find that techniques including recitation and permutation improve the random memory access capability of LMs. Furthermore, by applying this intervention to the realistic scenario of open-domain question answering, we validate that enhancing random access via recitation leads to notable improvements in question answering. The code to reproduce our experiments can be found at https://github.com/sail-sg/lm-random-memory-access.