Modern learning systems increasingly rely on amortized learning - the reuse of computation or inductive biases shared across tasks to enable rapid generalization to novel problems. This principle spans a range of approaches, including meta-learning, in-context learning, prompt tuning, and learned optimizers. While motivated by similar goals, these approaches differ in how they encode and leverage task-specific information, often provided as in-context examples. In this work, we propose a unified framework that describes how such methods differ primarily in which aspects of learning they amortize - such as initializations, learned updates, or predictive mappings - and in how they incorporate task data at inference. We introduce a taxonomy that categorizes amortized models into parametric, implicit, and explicit regimes, according to whether task adaptation is externalized, internalized, or jointly modeled. Building on this view, we identify a key limitation of current approaches: most struggle to scale to large datasets because their capacity to process task data at inference (e.g., context length) is limited. To address this, we propose iterative amortized inference, a class of models that refine solutions step by step over mini-batches, drawing inspiration from stochastic optimization. Our formulation bridges optimization-based meta-learning and forward-pass amortization in models such as LLMs, offering a scalable and extensible foundation for general-purpose task adaptation.
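To make the iterative scheme concrete, below is a minimal, hypothetical Python sketch of the control flow implied by the abstract: a learned update consumes the current solution state together with a mini-batch of task data and returns a refined state, so per-step cost is bounded by the batch size rather than by the full dataset. The names `update_fn`, `init_state`, and the loop structure are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def iterative_amortized_inference(update_fn, task_data, init_state,
                                  batch_size=32, n_steps=100, seed=0):
    """Sketch of iterative amortized inference over mini-batches.

    update_fn : a learned amortized model mapping (state, mini_batch) -> state,
                playing the role a hand-designed gradient rule plays in SGD.
    task_data : list of task examples (e.g., (x, y) pairs) available at inference.
    init_state: initial task-level solution (e.g., parameters or a latent summary).
    """
    rng = np.random.default_rng(seed)
    state = init_state
    n = len(task_data)
    for _ in range(n_steps):
        # Sample a mini-batch, as in stochastic optimization.
        idx = rng.choice(n, size=min(batch_size, n), replace=False)
        batch = [task_data[i] for i in idx]
        # One forward pass of the amortized model refines the solution;
        # compute and context requirements scale with the batch, not the dataset.
        state = update_fn(state, batch)
    return state
```

Under this reading, optimization-based meta-learning corresponds to a particular hand-specified choice of `update_fn` (a gradient step from a learned initialization), while fully amortized models replace it with a learned forward pass.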