Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability, where the model learns to perform an unseen task from a prompt consisting of input-output examples as the demonstration, without any parameter updates. The performance of ICL depends heavily on the quality of the selected in-context examples. However, previous selection methods are mostly based on simple heuristics, leading to sub-optimal performance. In this work, we formulate in-context example selection as a subset selection problem. We propose CEIL (Compositional Exemplars for In-context Learning), which is instantiated by Determinantal Point Processes (DPPs) to model the interaction between the given input and in-context examples, and is optimized through a carefully designed contrastive learning objective to capture the preferences of LMs. We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing. Extensive experiments demonstrate not only state-of-the-art performance but also the transferability and compositionality of CEIL, shedding new light on effective and efficient in-context learning. Our code is released at https://github.com/HKUNLP/icl-ceil.
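To make the subset-selection view concrete, the following is a minimal sketch of how a DPP can trade off relevance to the input against diversity among the chosen examples. It is an illustration under stated assumptions, not the released CEIL implementation: the function names `build_kernel` and `greedy_map`, the exponential relevance score, and the use of raw embeddings are all hypothetical choices, and the learned retriever and contrastive training described above are not reproduced. The kernel uses the standard quality-diversity decomposition, and selection uses a simple greedy approximation to MAP inference.

```python
import numpy as np

def build_kernel(query_emb, example_embs):
    """Quality-diversity DPP kernel: L[i, j] = r_i * S[i, j] * r_j (assumed form)."""
    # Normalise so that dot products are cosine similarities.
    q = query_emb / np.linalg.norm(query_emb)
    e = example_embs / np.linalg.norm(example_embs, axis=1, keepdims=True)
    relevance = np.exp(e @ q)   # r_i > 0: relevance of example i to the input
    similarity = e @ e.T        # S: Gram matrix of pairwise example similarities
    return relevance[:, None] * similarity * relevance[None, :]

def greedy_map(kernel, k):
    """Greedily pick up to k indices that approximately maximise det(kernel[S, S])."""
    selected = []
    for _ in range(k):
        best_idx, best_logdet = None, -np.inf
        for i in range(kernel.shape[0]):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(kernel[np.ix_(idx, idx)])
            # Skip candidates that make the submatrix singular (redundant examples).
            if sign > 0 and logdet > best_logdet:
                best_idx, best_logdet = i, logdet
        if best_idx is None:    # every remaining candidate is degenerate
            break
        selected.append(best_idx)
    return selected
```

Because the determinant of a submatrix shrinks as its rows become similar, a near-duplicate of an already selected example adds almost nothing to the objective, so the greedy step prefers an example that is both relevant and different from those already chosen. The exhaustive `slogdet` recomputation is chosen for readability; incremental Cholesky updates would be the usual way to speed this up.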