When large language models encounter conflicting information in context, which memories survive: early or recent? We adapt classical interference paradigms from cognitive psychology to answer this question, testing 39 LLMs across diverse architectures and scales. Every model shows the same pattern: proactive interference (PI) dominates retroactive interference (RI) (Cohen's d = 1.73, p < 0.0001), meaning early encodings are protected at the cost of recent information. This is the opposite of human memory, where RI typically dominates. Three findings indicate that RI and PI reflect separate memory mechanisms. First, RI and PI are uncorrelated (R^2 = 0.044), which argues against a single unified "memory capacity." Second, model size predicts RI resistance (R^2 = 0.49) but not PI resistance (R^2 = 0.06, n.s.), so only RI is capacity-dependent. Third, error analysis reveals distinct failure modes: RI failures are mainly passive retrieval failures (51%), while PI failures mainly show active primacy intrusion (56%); both show <1% hallucination. These patterns parallel the consolidation-retrieval distinction in cognitive science, suggesting that transformer attention creates a primacy bias with direct implications for interference-heavy applications.
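The effect size and R^2 values quoted above can be illustrated with a minimal sketch. It assumes the pooled-standard-deviation variant of Cohen's d and takes R^2 to be the squared Pearson correlation; the abstract does not say which variants were used. The function names and all numbers below are hypothetical placeholders, not the study's data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using a pooled standard deviation (one common variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy * sxy / (sxx * syy)


# Placeholder per-model scores for illustration only; NOT the paper's data.
pi_effect = [0.62, 0.55, 0.71, 0.48, 0.66]
ri_effect = [0.31, 0.40, 0.22, 0.35, 0.45]
log_params = [0.0, 1.0, 2.0, 3.0, 4.0]  # hypothetical log model sizes

print(f"Cohen's d (PI vs RI): {cohens_d(pi_effect, ri_effect):.2f}")
print(f"R^2 (PI vs RI):       {r_squared(pi_effect, ri_effect):.3f}")
print(f"R^2 (size vs RI):     {r_squared(log_params, ri_effect):.3f}")
```

If each model contributes both a PI and an RI score, a paired effect size (mean difference divided by the standard deviation of the differences) may be the more appropriate choice; the pooled form is shown only because it is the most widely recognized definition.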