Human learning is sensitive to rule-like structure and the curriculum of examples used for training. In tasks governed by succinct rules, learning is more robust when related examples are blocked across trials, but in the absence of such rules, interleaving is more effective. To date, no neural model has simultaneously captured these seemingly contradictory effects. Here we show that this same tradeoff spontaneously emerges with ``in-context learning'' (ICL) both in neural networks trained with metalearning and in large language models (LLMs). ICL is the ability to learn new tasks ``in context'' -- without weight changes -- via an inner-loop algorithm implemented in activation dynamics. Experiments with pretrained LLMs and metalearning transformers show that ICL exhibits the blocking advantage demonstrated in humans on a task involving rule-like structure, and conversely, that concurrent in-weight learning reproduces the interleaving advantage observed in humans on tasks lacking such structure.