Large language models show an emergent ability to learn a new task from a small number of input-output demonstrations. However, recent work shows that in-context learners largely rely on their pre-trained knowledge, such as the sentiment of the labels, instead of finding new associations in the input. Commonly used few-shot evaluation settings, which select in-context demonstrations at random, cannot isolate models' ability to learn a new skill from demonstrations, because most randomly selected demonstrations present no relations informative for prediction beyond exposing the new task distribution. To measure models' in-context learning ability independently of their memory, we introduce a Conceptual few-shot learning method that selects demonstrations sharing a possibly informative concept with the predicted sample. We extract a set of such concepts from annotated explanations and measure how much models can benefit from the presence of these concepts in few-shot demonstrations. We find that smaller models are more sensitive to the presented concepts. While some models are able to benefit from concept-presenting demonstrations for each assessed concept, none of the assessed in-context learners benefits from all presented reasoning concepts consistently, leaving in-context concept learning an open challenge.
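The contrast between random and concept-sharing demonstration selection can be illustrated with a minimal sketch. It is not the authors' implementation: the `Sample` type, the `select_demonstrations` and `build_prompt` helpers, the prompt template, and the toy sentiment data are all hypothetical, and the sketch assumes that each sample already carries a set of concepts extracted from its annotated explanation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    text: str
    label: str
    # Assumption: concepts were extracted from an annotated explanation beforehand.
    concepts: frozenset


def select_demonstrations(pool, target, k, conceptual=True, seed=0):
    """Pick up to k demonstrations for `target` from `pool`.

    conceptual=True keeps only candidates that share at least one concept
    with the target; conceptual=False is the random-selection baseline.
    """
    rng = random.Random(seed)
    candidates = [s for s in pool if s != target]
    if conceptual:
        candidates = [s for s in candidates if s.concepts & target.concepts]
    rng.shuffle(candidates)
    return candidates[:k]


def build_prompt(demos, target):
    """Format demonstrations and the predicted sample (hypothetical template)."""
    blocks = [f"Input: {d.text}\nOutput: {d.label}" for d in demos]
    blocks.append(f"Input: {target.text}\nOutput:")
    return "\n\n".join(blocks)


# Toy data, purely illustrative.
pool = [
    Sample("The film was not good.", "negative", frozenset({"negation"})),
    Sample("Better than the first part.", "positive", frozenset({"comparison"})),
    Sample("I would not call it boring.", "positive", frozenset({"negation"})),
    Sample("A pleasant afternoon watch.", "positive", frozenset()),
]
target = Sample("Not a bad ending.", "positive", frozenset({"negation"}))

demos = select_demonstrations(pool, target, k=2)
prompt = build_prompt(demos, target)
```

Under these assumptions, the benefit of a concept could be estimated as the difference in a model's accuracy between prompts built with `conceptual=True` and prompts built with `conceptual=False` over the same predicted samples.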