The performance of Large Language Models (LLMs) on downstream tasks often improves significantly when examples of the input-label relationship are included in the context. However, there is currently no consensus on how this in-context learning (ICL) ability of LLMs works: for example, while Xie et al. (2021) liken ICL to a general-purpose learning algorithm, Min et al. (2022b) argue that ICL does not even learn label relationships from in-context examples. In this paper, we study (1) how labels of in-context examples affect predictions, (2) how label relationships learned during pre-training interact with input-label examples provided in-context, and (3) how ICL aggregates label information across in-context examples. Our findings suggest that LLMs usually incorporate information from in-context labels, but that pre-training and in-context label relationships are treated differently, and that the model does not consider all in-context information equally. Our results give insights into understanding and aligning LLM behavior.