Stronger Models are NOT Stronger Teachers for Instruction Tuning

Instruction tuning has been widely adopted to ensure large language models (LLMs) follow user instructions effectively. The resulting instruction-following capabilities of LLMs rely heavily on the instruction datasets used for tuning. Recently, synthetic instruction datasets have emerged as an economically viable solution to provide LLMs with diverse, high-quality instructions. However, existing approaches typically assume that larger or stronger models are stronger teachers for instruction tuning, and hence simply adopt these models as response generators for the synthetic instructions. In this paper, we challenge this commonly adopted assumption. Our extensive experiments across five base models and twenty response generators reveal that larger and stronger models are not necessarily stronger teachers of smaller models. We refer to this phenomenon as the Larger Models' Paradox. We observe that existing metrics cannot precisely predict the effectiveness of response generators, since they ignore the compatibility between teachers and the base models being fine-tuned. We thus develop a novel metric, named Compatibility-Adjusted Reward (CAR), to measure the effectiveness of response generators. Our experiments across five base models demonstrate that CAR outperforms almost all baselines.

