In-context learning (ICL) is an attractive method for solving a wide range of problems. Inspired by Garg et al. (2022), we look closely at ICL in a variety of train and test settings for several transformer models of different sizes trained from scratch. Our study complements prior work by pointing out several systematic failures of these models to generalize to data not in the training distribution, thereby showing some limitations of ICL. We find that the models adopt strategies for these tasks that differ markedly from standard solutions.