In-context learning (ICL) is a powerful ability that emerges in transformer models, enabling them to learn from context without weight updates. Recent work has established emergent ICL as a transient phenomenon that can sometimes disappear after long training times. In this work, we sought a mechanistic understanding of these transient dynamics. First, we find that, after the disappearance of ICL, the asymptotic strategy is a remarkable hybrid between in-weights and in-context learning, which we term "context-constrained in-weights learning" (CIWL). CIWL is in competition with ICL, and eventually replaces it as the dominant strategy of the model (thus leading to ICL transience). However, we also find that the two competing strategies actually share sub-circuits, which gives rise to cooperative dynamics as well. For example, in our setup, ICL is unable to emerge quickly on its own, and can only be enabled through the simultaneous slow development of asymptotic CIWL. CIWL thus both cooperates and competes with ICL, a phenomenon we term "strategy coopetition." We propose a minimal mathematical model that reproduces these key dynamics and interactions. Informed by this model, we were able to identify a setup where ICL is truly emergent and persistent.
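To make the "strategy coopetition" idea concrete, the sketch below simulates a hypothetical toy dynamical system: this is an illustration we constructed, not the authors' actual mathematical model. It assumes three scalar strengths: a slowly developing shared sub-circuit `s`, an ICL strategy `i` whose growth is gated by `s` (cooperation) and suppressed by CIWL (competition), and a CIWL strategy `c` that builds on the same shared circuit more slowly. All coefficients are arbitrary choices made so the trajectory exhibits the abstract's qualitative story: ICL rises first, then fades as CIWL becomes the dominant asymptotic strategy.

```python
import numpy as np

def simulate(steps=2000, dt=0.1):
    """Toy coopetition dynamics (illustrative only).
    s: shared sub-circuit strength, i: ICL strength, c: CIWL strength."""
    s, i, c = 0.01, 0.01, 0.01
    hist = np.empty((steps, 3))
    for t in range(steps):
        ds = 0.05 * s * (1 - s)                   # shared circuit develops slowly (logistic)
        di = 0.5 * s * i * (1 - i) - 0.6 * c * i  # ICL: enabled by s, suppressed by c
        dc = 0.1 * s * c * (1 - c)                # CIWL: builds on the same shared circuit, more slowly
        s, i, c = s + dt * ds, i + dt * di, c + dt * dc
        hist[t] = (s, i, c)
    return hist

traj = simulate()
icl = traj[:, 1]   # ICL rises, peaks, then collapses (transience)
peak = icl.argmax()
ciwl_final = traj[-1, 2]  # CIWL ends up near saturation (asymptotic strategy)
```

Because ICL's growth term is gated by `s`, ICL cannot take off until the shared circuit has developed, mirroring the finding that ICL only emerges through the slow development of CIWL's machinery; the same coupling that enables ICL later drives its replacement.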