In-context learning is a new learning paradigm in which a language model conditions on a few input-output pairs (demonstrations) together with a test input and directly outputs the prediction. Its performance has been shown to depend heavily on the provided demonstrations, which has motivated research on demonstration retrieval: given a test input, relevant examples are retrieved from the training set to serve as informative demonstrations for in-context learning. Previous works train task-specific retrievers for each task separately; such methods are hard to transfer and scale across tasks, and separately trained retrievers incur substantial parameter storage and deployment costs. In this paper, we propose Unified Demonstration Retriever (\textbf{UDR}), a single model that retrieves demonstrations for a wide range of tasks. To train UDR, we use the language model's feedback to cast the training signals of various tasks into a unified list-wise ranking formulation. We then propose a multi-task list-wise ranking training framework with an iterative mining strategy for finding high-quality candidates, which helps UDR fully incorporate the signals of various tasks. Experiments on 30+ tasks across 13 task families and multiple data domains show that UDR significantly outperforms baselines. Further analyses show the effectiveness of each proposed component and UDR's strong ability in various scenarios, including different LMs (1.3B - 175B), unseen datasets, and varying demonstration quantities.
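To make the list-wise formulation concrete, the following is a minimal sketch of how language model feedback could be turned into a ranking target for a retriever. It is not the training objective used for UDR, which the abstract does not specify: a ListMLE-style (Plackett-Luce) loss is swapped in as a stand-in, and the feedback values, the retriever scores, and the function names `rank_by_lm_feedback` and `listmle_loss` are all hypothetical.

```python
import math


def rank_by_lm_feedback(lm_scores):
    """Order candidate indices from most to least helpful.

    Assumption: a higher score means the LM did better on the test
    input when that candidate was used as the demonstration.
    """
    return sorted(range(len(lm_scores)), key=lambda i: lm_scores[i], reverse=True)


def listmle_loss(retriever_scores, ranking):
    """Negative log-likelihood of `ranking` under a Plackett-Luce model.

    This is a stand-in list-wise loss, not the loss from the paper.
    """
    ordered = [retriever_scores[i] for i in ranking]
    loss = 0.0
    for k in range(len(ordered)):
        tail = ordered[k:]
        m = max(tail)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(s - m) for s in tail))
        loss += log_z - ordered[k]
    return loss


# Hypothetical LM feedback for three candidate demonstrations.
lm_feedback = [-2.3, -0.4, -1.1]
ranking = rank_by_lm_feedback(lm_feedback)

# Retriever scores that agree or disagree with the LM-induced ranking.
loss_agree = listmle_loss([0.1, 2.0, 1.0], ranking)
loss_disagree = listmle_loss([2.0, 0.1, 1.0], ranking)
print(ranking, loss_agree < loss_disagree)
```

Because every task is reduced to the same form (a list of candidates plus a ranking induced by LM feedback), lists from different tasks can in principle be mixed in one training batch, which is what allows a single retriever to be shared across tasks.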