Instruction tuning plays a crucial role in shaping the outputs of language models (LMs) toward desired styles. In this work, we propose a simple yet effective method, Instruction Modelling (IM), which trains LMs by applying the loss function to the instruction and prompt part rather than solely to the output part. Through experiments across 21 diverse benchmarks, we show that, in many scenarios, IM effectively improves LM performance on both NLP tasks (e.g., MMLU, TruthfulQA, and HumanEval) and open-ended generation benchmarks (e.g., MT-Bench and AlpacaEval). Remarkably, in the most advantageous case, IM boosts model performance on AlpacaEval 1.0 by over 100%. We identify two key factors influencing the effectiveness of IM: (1) the ratio between instruction length and output length in the training data, and (2) the number of training examples. We observe that IM is especially beneficial when models are trained on datasets with lengthy instructions paired with brief outputs, or under the Superficial Alignment Hypothesis (SAH), where only a small number of training examples is used for instruction tuning. Further analysis substantiates our hypothesis that the improvement can be attributed to reduced overfitting to instruction tuning datasets. Our work provides practical guidance for instruction tuning LMs, especially in low-resource scenarios.
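At the implementation level, the difference between IM and standard instruction tuning reduces to a labelling choice when building training targets. The following is a minimal sketch, not the authors' code: it assumes the common convention (used, e.g., by HuggingFace Transformers) of marking tokens with label `-100` to exclude them from the cross-entropy loss, and the function name `build_labels` is hypothetical.

```python
IGNORE_INDEX = -100  # label value conventionally excluded from the loss


def build_labels(instruction_ids, output_ids, instruction_modelling=False):
    """Return per-token labels for a concatenated (instruction, output) sequence.

    With instruction_modelling=False (standard instruction tuning), instruction
    tokens are replaced by IGNORE_INDEX so only output tokens contribute to the
    loss. With instruction_modelling=True (IM), instruction tokens keep their
    ids and the model is trained on them as well.
    """
    if instruction_modelling:
        instr_labels = list(instruction_ids)
    else:
        instr_labels = [IGNORE_INDEX] * len(instruction_ids)
    return instr_labels + list(output_ids)


# Toy example: four instruction tokens, two output tokens.
instr, out = [11, 12, 13, 14], [21, 22]
print(build_labels(instr, out))        # standard: [-100, -100, -100, -100, 21, 22]
print(build_labels(instr, out, True))  # IM:       [11, 12, 13, 14, 21, 22]
```

Because IM supervises the (often much longer) instruction span as well, it adds training signal per example, which is consistent with the paper's finding that the benefit is largest for long-instruction/short-output data and small training sets.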