In-context learning (ICL) is a few-shot learning paradigm in which a model learns a mapping from input-output demonstration pairs and applies it appropriately to new instances. Despite the remarkable ICL capabilities demonstrated by Large Language Models (LLMs), existing work depends heavily on large-scale labeled support sets, which are not always feasible in practical scenarios. To refine this approach, we focus on an innovative selective annotation mechanism that precedes standard demonstration retrieval. We introduce the Language Model-based Determinantal Point Process (LM-DPP), which jointly considers the uncertainty and diversity of unlabeled instances, yielding a subset for annotation that strikes a trade-off between the two factors. We apply LM-DPP to various language models, including GPT-J, LLaMA, and GPT-3. Experimental results on 9 NLU and 2 generation datasets demonstrate that LM-DPP effectively selects canonical examples. Further analysis reveals that LLMs benefit most from subsets that combine low uncertainty with high diversity.