Large language models (LLMs) exhibit strong in-context learning capabilities, but how they track and retrieve information from context remains underexplored. Drawing on the free recall paradigm from cognitive science, in which participants recall list items in any order, we show that several open-source LLMs consistently display a serial-recall-like pattern: they assign peak probability to the token that immediately follows a repeated token in the input sequence (a +1 lag bias). Through systematic ablation experiments, we find that induction heads, specialized attention heads that attend to the token following a previous occurrence of the current token, play an important role in this phenomenon. Ablating heads with high induction scores substantially reduces the +1 lag bias, whereas ablating random heads does not produce a comparable reduction. Ablating high-induction-score heads also impairs few-shot serial recall more than ablating random heads does. Our findings highlight a mechanistically specific connection between induction heads and temporal context processing in transformers, suggesting that these heads are especially important for ordered retrieval and serial-recall-like behavior during in-context learning.
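The two quantities the abstract relies on, a head's induction score and the probability assigned at each lag from a repeated token, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one common definition of the induction score (mean attention from each token in the second half of a twice-repeated sequence to the token that followed its first occurrence), and the function names `induction_score` and `lag_profile`, along with their input formats, are hypothetical.

```python
def induction_score(attn, half_len):
    """Mean attention from each second-half query to its induction target.

    Assumed setup (not from the source): the input is `half_len` distinct
    tokens repeated twice, and attn[q][k] is one head's attention weight
    from query position q to key position k.
    """
    total = 0.0
    for q in range(half_len, 2 * half_len):
        prev_occurrence = q - half_len          # earlier position of the same token
        total += attn[q][prev_occurrence + 1]   # the token right after it
    return total / half_len


def lag_profile(next_token_probs, list_tokens, cue_index):
    """Map lag -> probability given to the list item at that lag from the cue.

    next_token_probs: dict of token -> model probability after the cue token
        has been repeated at the end of the input.
    list_tokens: the distinct list items in presentation order.
    cue_index: position in list_tokens of the repeated (cue) token.
    """
    return {
        pos - cue_index: next_token_probs.get(tok, 0.0)
        for pos, tok in enumerate(list_tokens)
        if pos != cue_index
    }


# Toy head that places all of its attention on the induction target.
n = 3
perfect_head = [[0.0] * (2 * n) for _ in range(2 * n)]
for q in range(n, 2 * n):
    perfect_head[q][q - n + 1] = 1.0
print(induction_score(perfect_head, n))

# Toy next-token distribution after repeating "B" from the list A, B, C, D.
profile = lag_profile({"A": 0.1, "C": 0.7, "D": 0.2}, ["A", "B", "C", "D"], 1)
print(max(profile, key=profile.get))
```

In this toy case the head scores 1.0 and the lag profile peaks at +1, which is the serial-recall-like pattern described above. Under these assumptions, an ablation study would rank heads by `induction_score`, zero out the top-ranked ones, and compare the resulting lag profile against one obtained after zeroing randomly chosen heads.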