Large pretrained language models (LMs) have shown impressive in-context learning (ICL) ability: the model learns to perform an unseen task from a prompt containing input-output examples as demonstrations, without any parameter updates. ICL performance depends heavily on the quality of the selected in-context examples. However, previous selection methods are mostly based on simple heuristics, leading to sub-optimal performance. In this work, we formulate in-context example selection as a subset selection problem. We propose CEIL (Compositional Exemplars for In-context Learning), which uses Determinantal Point Processes (DPPs) to model the interaction between the given input and the in-context examples, and which is optimized with a carefully designed contrastive learning objective to capture the preferences of LMs. We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing. Extensive experiments demonstrate not only state-of-the-art performance but also the transferability and compositionality of CEIL, shedding new light on effective and efficient in-context learning. Our code is released at https://github.com/HKUNLP/icl-ceil.