Large Language Models (LLMs) have proven effective at In-Context Learning (ICL), an ability that allows them to create predictors from labeled examples. Few studies have explored the interplay between ICL and specific properties of the functions it attempts to approximate. In our study, we use a formal framework to explore ICL and propose a new task: approximating functions with a varying number of minima. We implement a method that produces functions with given inputs as minima. We find that increasing the number of minima degrades ICL performance. At the same time, our evaluation shows that ICL outperforms a 2-layer Neural Network (2NN) model. Furthermore, ICL learns faster than 2NN in all settings. We validate the findings through a set of few-shot experiments across various hyperparameter configurations.
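The abstract does not say how functions with prescribed minima are built, so the following is only an illustrative sketch under an assumed construction: a product of squared distances, f(x) = prod_i (x - m_i)^2, which is non-negative and equals zero exactly at each chosen m_i. The names `make_function_with_minima` and `sample_prompt`, the sampling range, and the seed are hypothetical and not taken from the paper.

```python
import numpy as np


def make_function_with_minima(minima):
    """Assumed construction: f(x) = prod_i (x - m_i)^2.

    f is non-negative and is zero exactly at each m_i, so every m_i
    is a global minimum. This is not necessarily the paper's method.
    """
    minima = np.asarray(minima, dtype=float)

    def f(x):
        x = np.asarray(x, dtype=float)
        # Broadcast x against the minima, then multiply the squared distances.
        return np.prod((x[..., None] - minima) ** 2, axis=-1)

    return f


def sample_prompt(f, n_examples, low=-2.0, high=2.0, seed=0):
    """Draw labeled (x, f(x)) pairs, as would be placed in an ICL prompt."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(low, high, size=n_examples)
    return xs, f(xs)


# Hypothetical usage: a function with two minima and eight in-context examples.
f = make_function_with_minima([-1.0, 0.5])
xs, ys = sample_prompt(f, n_examples=8)
```

Under this assumed construction, adding entries to the list of minima raises the polynomial degree by two each time, which gives one simple way to vary task difficulty with the number of minima.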