Large Language Models (LLMs) have proven effective at In-Context Learning (ICL), an ability that allows them to build predictors from labeled examples supplied in the prompt. Few studies have explored the interplay between ICL and specific properties of the functions it attempts to approximate. In this study, we use a formal framework to explore ICL and propose a new task: approximating functions with a varying number of minima. We implement a method that produces functions with given inputs as minima. We find that increasing the number of minima degrades ICL performance. At the same time, our evaluation shows that ICL outperforms a 2-layer neural network (2NN) model. Furthermore, ICL learns faster than the 2NN in all settings. We validate these findings through a set of few-shot experiments across various hyperparameter configurations.
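The abstract does not specify how functions with prescribed minima are constructed. As an illustration only, one simple way to obtain a one-dimensional function whose minima sit at chosen inputs is a product of squared distances, f(x) = ∏ᵢ (x − mᵢ)², which is zero exactly at each mᵢ and positive elsewhere. The sketch below assumes this construction; the names `make_function_with_minima` and `sample_prompt` and the sampling range are hypothetical and are not taken from the paper.

```python
import numpy as np


def make_function_with_minima(minima):
    """Return f(x) = prod_i (x - m_i)^2 (an assumed construction, not the paper's).

    f is non-negative and equals zero exactly at each m_i, so every m_i
    is a global minimum of f.
    """
    minima = np.asarray(minima, dtype=float)

    def f(x):
        x = np.asarray(x, dtype=float)
        # Add a trailing axis so x broadcasts against the vector of minima,
        # then multiply the squared gaps along that axis.
        return np.prod((x[..., None] - minima) ** 2, axis=-1)

    return f


def sample_prompt(f, n_examples, rng, low=-2.0, high=2.0):
    """Draw labeled (x, f(x)) pairs that could serve as in-context examples."""
    xs = rng.uniform(low, high, size=n_examples)
    return xs, f(xs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    f = make_function_with_minima([-1.0, 0.5])
    xs, ys = sample_prompt(f, n_examples=5, rng=rng)
    for x, y in zip(xs, ys):
        print(f"x={x:+.3f}  f(x)={y:.4f}")
```

Passing a longer list of minima produces a harder target under this construction, which matches the kind of variation in the number of minima that the task describes.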