Large language models can learn new tasks in context, where they are given instructions and a few annotated examples. However, the effectiveness of in-context learning depends heavily on the provided context, and performance on a downstream task can vary substantially with the instruction. Importantly, this dependence on the context can be unpredictable; for example, a seemingly more informative instruction may lead to worse performance. In this paper, we propose an alternative approach, which we term in-context probing. As in in-context learning, we contextualize the representation of the input with an instruction, but instead of decoding the output prediction, we probe the contextualized representation to predict the label. Through a series of experiments on a diverse set of classification tasks, we show that in-context probing is significantly more robust to changes in instructions. We further show that probing is particularly helpful for building classifiers on top of smaller models, even with only a hundred training examples.
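The probing step described above can be illustrated with a minimal sketch. It is not the paper's implementation: `encode` is a hypothetical stand-in that returns bag-of-words counts over a made-up vocabulary, where a real setup would return a hidden state from a frozen language model for the instruction plus input. The probe is assumed here to be a logistic-regression classifier, and the instruction, examples, and hyperparameters are illustrative only.

```python
import math

# Hypothetical stand-in for a frozen language model. A real setup would return
# the model's hidden state for the prompt; here we return word counts instead.
VOCAB = ["review", "sentiment", "good", "great", "love", "bad", "awful", "hate"]


def encode(instruction, text):
    """Return a toy 'contextualized representation' of instruction + input."""
    tokens = (instruction + " " + text).lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def train_probe(features, labels, epochs=200, lr=0.5):
    """Fit a logistic-regression probe with per-example gradient steps."""
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
            err = p - y                     # gradient of the log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(probe, x):
    """Predict a label from the representation; nothing is decoded as text."""
    weights, bias = probe
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z > 0 else 0


# Illustrative instruction and a tiny labeled set (1 = positive, 0 = negative).
instruction = "Classify the sentiment of the following review"
train = [
    ("good movie , great cast", 1),
    ("i love it", 1),
    ("bad plot", 0),
    ("awful and i hate it", 0),
]
features = [encode(instruction, text) for text, _ in train]
labels = [label for _, label in train]
probe = train_probe(features, labels)

print(predict(probe, encode(instruction, "great film i love")))
print(predict(probe, encode(instruction, "awful bad film")))
```

Only the probe is trained in this sketch; the encoder stays fixed, which mirrors the idea of reading the label off the representation rather than decoding it from the model's output.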