Instruction learning for Large Language Models (LLMs) has enabled zero-shot task generalization. However, instruction learning has been approached predominantly as a fine-tuning problem, through instruction tuning and reinforcement learning from human feedback, in which LLMs are fine-tuned on many tasks paired with instructions. In this paper, we present the surprising finding that applying in-context learning to instruction learning, which we call In-Context Instruction Learning (ICIL), significantly improves zero-shot task generalization performance for both pretrained and instruction-fine-tuned models. A core advantage of ICIL is that it evaluates all tasks with a single fixed prompt, which is a concatenation of cross-task demonstrations. In particular, we demonstrate that even the most powerful instruction-fine-tuned baseline (text-davinci-003) improves by 9.3% with ICIL, indicating that the effect of ICIL is complementary to instruction-based fine-tuning.
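The single fixed prompt described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Instruction:` / `Input:` / `Output:` template, the `Demonstration` class, the function names, and the example demonstrations are all assumptions made for this sketch, since the abstract does not specify the exact prompt layout or the demonstrations used.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Demonstration:
    """One cross-task example: an instruction, an input, and its answer."""
    instruction: str
    input_text: str
    output_text: str


def format_demonstration(demo: Demonstration) -> str:
    # Hypothetical template; the source does not specify the exact layout.
    return (
        f"Instruction: {demo.instruction}\n"
        f"Input: {demo.input_text}\n"
        f"Output: {demo.output_text}"
    )


def build_fixed_prefix(demos: Sequence[Demonstration]) -> str:
    # Concatenate demonstrations drawn from different tasks into one prefix.
    return "\n\n".join(format_demonstration(d) for d in demos)


def build_prompt(fixed_prefix: str, instruction: str, input_text: str) -> str:
    # The same prefix is reused for every target task; only the tail changes.
    return (
        f"{fixed_prefix}\n\n"
        f"Instruction: {instruction}\n"
        f"Input: {input_text}\n"
        f"Output:"
    )


# Illustrative demonstrations from two different tasks (made up for this sketch).
DEMOS = [
    Demonstration(
        "Classify the sentiment of the review as positive or negative.",
        "The plot was dull and the acting was worse.",
        "negative",
    ),
    Demonstration(
        "Answer the question with yes or no.",
        "Is water composed of hydrogen and oxygen?",
        "yes",
    ),
]
FIXED_PREFIX = build_fixed_prefix(DEMOS)

# Two unseen target tasks share the identical prefix.
prompt_a = build_prompt(
    FIXED_PREFIX,
    "Decide whether the two sentences are paraphrases.",
    "A: He left early. B: He departed ahead of time.",
)
prompt_b = build_prompt(
    FIXED_PREFIX,
    "Name the topic of the headline.",
    "Central bank raises interest rates again.",
)
```

The design point is that `FIXED_PREFIX` is built once and never changes per task, so no task-specific demonstration retrieval is needed at evaluation time; the resulting string would be passed unchanged to either a pretrained or an instruction-fine-tuned model.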