This paper examines the elusive mechanism underpinning in-context learning in Large Language Models (LLMs) through the lens of surface repetitions. We quantify the role of surface features in text generation and empirically establish the existence of \emph{token co-occurrence reinforcement}, a principle that strengthens the relationship between two tokens based on their contextual co-occurrences. By analyzing the dual impacts of these features, our research illuminates the internal workings of in-context learning and explains the reasons for its failures. These findings contribute to the understanding of in-context learning and its potential limitations.