When a language model is trained to predict natural language sequences, its prediction at each moment depends on a representation of prior context. What kind of information about the prior context can language models retrieve? We tested whether language models could retrieve the exact words that occurred previously in a text. In our paradigm, language models (transformers and an LSTM) processed English text in which a list of nouns occurred twice. We operationalized retrieval as the reduction in surprisal from the first to the second list. We found that the transformers retrieved both the identity and ordering of nouns from the first list. Further, the transformers' retrieval was markedly enhanced when they were trained on a larger corpus and with greater model depth. Lastly, their ability to index prior tokens was dependent on learned attention patterns. In contrast, the LSTM exhibited less precise retrieval, which was limited to list-initial tokens and to short intervening texts. The LSTM's retrieval was not sensitive to the order of nouns and it improved when the list was semantically coherent. We conclude that transformers implemented something akin to a working memory system that could flexibly retrieve individual token representations across arbitrary delays; conversely, the LSTM maintained a coarser and more rapidly decaying semantic gist of prior tokens, weighted toward the earliest items.
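The retrieval measure described above, a reduction in surprisal from the first to the second occurrence of the noun list, can be illustrated with a minimal sketch. The abstract does not specify how surprisal is aggregated over list tokens, so the mean over tokens and the simple difference used here are assumptions, as are the function names and the made-up token probabilities; this is not the authors' implementation.

```python
import math
from statistics import mean


def surprisal_bits(prob):
    """Surprisal of a token that the model assigned probability `prob`, in bits."""
    return -math.log2(prob)


def surprisal_reduction(first_probs, second_probs):
    """Mean surprisal on the first list minus mean surprisal on the second list.

    Assumption: averaging over tokens and taking a difference is one simple
    way to express a "reduction in surprisal"; the source does not fix the
    exact aggregation.
    """
    first = mean(surprisal_bits(p) for p in first_probs)
    second = mean(surprisal_bits(p) for p in second_probs)
    return first - second


# Hypothetical model probabilities for the same three nouns at their first
# and second occurrence (values invented for illustration only).
first_list = [0.125, 0.25, 0.125]
second_list = [0.5, 0.5, 0.5]

reduction = surprisal_reduction(first_list, second_list)
print(f"surprisal reduction: {reduction:.3f} bits")
```

A positive value means the nouns were more predictable on their second occurrence, which is the pattern the abstract treats as evidence of retrieval; a value near zero would indicate that the first list left no usable trace.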