In recent years, pre-trained large language models have demonstrated a remarkably efficient inference-time few-shot learning capability known as in-context learning. However, existing literature has highlighted that this capability is sensitive to the selection of few-shot demonstrations, and the mechanisms by which it arises from standard language model pretraining objectives remain poorly understood. In this study, we examine in-context learning through a Bayesian lens, viewing large language models as topic models that implicitly infer task-related information from demonstrations. On this premise, we propose an algorithm for selecting optimal demonstrations from a set of annotated data and demonstrate a significant 12.5% improvement relative to the random selection baseline, averaged over eight GPT-2 and GPT-3 models on eight different real-world text classification datasets. Our empirical findings support our hypothesis that large language models implicitly infer a latent concept variable.
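The Bayesian view described in the abstract can be sketched in one line. The notation is assumed here for illustration and is not taken from the abstract: $\theta$ denotes the latent concept variable, $D$ the set of demonstrations, $x$ the test input, and $y$ its label. The sketch also assumes that $y$ is conditionally independent of $D$ given $\theta$ and $x$. Under those assumptions, an in-context prediction can be read as a marginal over the latent concept:

```latex
% Illustrative notation (assumed, not from the abstract):
%   \theta : latent concept variable
%   D      : few-shot demonstrations
%   x, y   : test input and its label
% Assumes y is independent of D given \theta and x.
\[
  P(y \mid x, D) \;=\; \int P(y \mid x, \theta)\, P(\theta \mid x, D)\, \mathrm{d}\theta
\]
```

Read this way, demonstrations influence the prediction only through the posterior $P(\theta \mid x, D)$, which would be consistent with the reported sensitivity to which demonstrations are chosen.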