In-context learning (ICL) empowers generative models to address new tasks effectively and efficiently on the fly, without relying on any artificially crafted optimization techniques. In this paper, we study extending ICL to a broader range of tasks, with a longer learning horizon and higher improvement potential, which we term General-Purpose In-Context Learning (GPICL). To this end, we introduce two lightweight benchmarks specifically crafted for training and evaluating GPICL capabilities. Each benchmark encompasses a vast number of tasks with significant task variance, facilitating meta-training that minimizes inductive bias. These tasks are also designed to promote long-horizon in-context learning through continuous generation and interaction. These characteristics require models to leverage contexts and interaction histories to enhance their capabilities, across domains such as language modeling, decision-making, and world modeling. Our experiments on baseline models demonstrate that meta-training with minimal inductive bias, and learning ICL from the ground up, is feasible across all of these domains. Additionally, our findings indicate that parameter scale alone may not be crucial for ICL or GPICL, suggesting alternative approaches such as scaling up contexts and memory states.