
Is In-Context Learning Learning?


Adrian de Wynter

From arXiv. Accepted to ICLR 2026 (camera-ready version).

In-context learning (ICL) allows some autoregressive models to solve tasks via next-token prediction and without needing further training. This has led to claims about these models' ability to solve (learn) unseen tasks with only a few shots (exemplars) in the prompt. However, deduction does not always imply learning, as ICL does not explicitly encode a given observation. Instead, the models rely on their prior knowledge and the exemplars given, if any. We argue that, mathematically, ICL fits the definition of learning; however, its full characterisation requires empirical work. We then carry out a large-scale analysis of ICL, ablating out or accounting for memorisation, pretraining, distributional shifts, and prompting style and phrasing. We find that, empirically, ICL is limited in its ability to learn and generalise to unseen tasks. Namely, in the limit where exemplars become more numerous, accuracy is insensitive to exemplar distribution, model, prompt style, and the input's linguistic features. Instead, ICL deduces patterns from regularities in the prompt, which leads to distributional sensitivity, especially in prompting styles such as chain-of-thought. Given the varied accuracies on formally similar tasks, we conclude that autoregression's ad-hoc encoding is not a robust mechanism for learning, which suggests limited all-purpose generalisability.

