For large language models (LLMs), improving instruction-following capability often involves curating large-scale training data. This is typically done through one of two schemes: i) Scaling-Inputs: increasing the number of (input, output) pairs per task instruction, aiming for better instruction adherence. ii) Scaling Input-Free Tasks: increasing the number of tasks, each composed of an (instruction, output) pair with no separate input. However, LLMs trained under Scaling-Inputs tend to be overly sensitive to inputs, leading to misinterpretation of, or non-compliance with, instructions. Conversely, Scaling Input-Free Tasks demands a substantial number of tasks, yet is less effective at instruction following on instances from Scaling-Inputs. This work introduces MUFFIN, a new scheme of instruction-following dataset curation. Specifically, we automatically Scale Tasks per Input by diversifying these tasks with various input facets. Experimental results across four zero-shot benchmarks, spanning both the Scaling-Inputs and Scaling Input-Free Tasks schemes, show that LLMs at various scales trained on MUFFIN generally demonstrate superior instruction-following capabilities compared to those trained on the two aforementioned schemes.
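To make the three data shapes concrete, the following is a minimal illustrative sketch, not the MUFFIN pipeline itself. The `Example` record, the three helper functions, and all sample texts are hypothetical and chosen only to show how each scheme arranges instructions, inputs, and outputs.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Example:
    """One training instance. All field names are illustrative assumptions."""
    instruction: str
    input: Optional[str]  # None when the task needs no separate input
    output: str


def scaling_inputs(instruction: str, pairs: List[Tuple[str, str]]) -> List[Example]:
    """Scaling-Inputs: one instruction, many (input, output) pairs."""
    return [Example(instruction, inp, out) for inp, out in pairs]


def scaling_input_free_tasks(tasks: List[Tuple[str, str]]) -> List[Example]:
    """Scaling Input-Free Tasks: many (instruction, output) tasks, no input."""
    return [Example(instr, None, out) for instr, out in tasks]


def scaling_tasks_per_input(inp: str, tasks: List[Tuple[str, str]]) -> List[Example]:
    """Scaling Tasks per Input: one input, many (instruction, output) tasks."""
    return [Example(instr, inp, out) for instr, out in tasks]


# Hypothetical sample data, not drawn from any real dataset.
review = "The battery lasts two days, but the screen scratches easily."

inputs_scheme = scaling_inputs(
    "Classify the sentiment of the review.",
    [("Great phone!", "positive"), ("It broke in a week.", "negative")],
)

input_free_scheme = scaling_input_free_tasks(
    [
        ("Name the capital of France.", "Paris"),
        ("Write a two-word greeting.", "Hello there"),
    ]
)

tasks_per_input_scheme = scaling_tasks_per_input(
    review,
    [
        ("List the product's strengths.", "Long battery life"),
        ("List the product's weaknesses.", "Screen scratches easily"),
        ("How long does the battery last?", "Two days"),
    ],
)
```

In this sketch, the first scheme varies the input under a fixed instruction, the second varies the instruction with no input at all, and the third holds the input fixed while varying the instruction, which is the arrangement the abstract describes as Scaling Tasks per Input.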