Large pretrained language models (LMs) have shown impressive in-context learning (ICL) ability: the model learns to perform an unseen task from a prompt of input-output examples that serve as demonstrations, without any parameter updates. ICL performance depends heavily on the quality of the selected in-context examples. However, previous selection methods are mostly based on simple heuristics, leading to sub-optimal performance. In this work, we formulate in-context example selection as a subset selection problem. We propose CEIL (Compositional Exemplars for In-context Learning), which uses Determinantal Point Processes (DPPs) to model the interaction between the given input and the in-context examples, and which is optimized with a carefully designed contrastive learning objective that captures the LM's preferences. We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing. Extensive experiments demonstrate not only state-of-the-art performance but also the transferability and compositionality of CEIL, shedding new light on effective and efficient in-context learning. Our code is released at https://github.com/HKUNLP/icl-ceil.
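To make the subset-selection view concrete, the following is a minimal sketch of how a DPP can pick a small set of exemplars that are both relevant to the input and diverse among themselves. It is not the implementation in the released repository. The kernel construction (exponentiated cosine relevance times pairwise cosine similarity), the function names `cosine`, `build_kernel`, and `greedy_dpp_select`, and the toy embeddings are all assumptions made for illustration, and the selection step is the standard greedy approximation to DPP MAP inference rather than anything the abstract specifies.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_kernel(candidates, query):
    """Assumed kernel: L[i][j] = r_i * sim(i, j) * r_j.

    r_i is a positive relevance score of candidate i to the query, and
    sim(i, j) is the cosine similarity between candidates i and j.
    """
    relevance = [math.exp(cosine(c, query)) for c in candidates]
    n = len(candidates)
    return [
        [relevance[i] * cosine(candidates[i], candidates[j]) * relevance[j]
         for j in range(n)]
        for i in range(n)
    ]


def greedy_dpp_select(kernel, k, eps=1e-10):
    """Greedily pick up to k indices, each time adding the item that most
    increases the determinant of the selected sub-kernel.

    gain[i] tracks the factor by which the determinant would grow if item i
    were added; it is updated incrementally via partial Cholesky rows.
    """
    n = len(kernel)
    gain = [kernel[i][i] for i in range(n)]
    chol = [[] for _ in range(n)]
    selected = []
    while len(selected) < min(k, n):
        remaining = [i for i in range(n) if i not in selected]
        j = max(remaining, key=lambda i: gain[i])
        if gain[j] < eps:
            break  # every remaining item is redundant given the selection
        selected.append(j)
        d_j = math.sqrt(gain[j])
        for i in remaining:
            if i == j:
                continue
            proj = sum(a * b for a, b in zip(chol[j], chol[i]))
            e = (kernel[j][i] - proj) / d_j
            chol[i].append(e)
            gain[i] -= e * e
    return selected


# Toy data: candidates 0 and 1 are identical, candidate 2 is orthogonal.
candidates = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
query = [1.0, 0.2]
kernel = build_kernel(candidates, query)
chosen = greedy_dpp_select(kernel, k=2)
print(chosen)
```

In this toy setup the two identical candidates are equally relevant, yet the determinant-based gain of the second one collapses once the first is chosen, so the selection falls back to the less relevant but non-redundant candidate. A plain top-k retrieval by relevance would instead return both duplicates.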