To induce desired behaviors in large language models (LLMs) for interaction-driven tasks, the instruction-tuning stage typically trains LLMs on instruction-response pairs using the next-token prediction (NTP) loss. Previous work aiming to improve instruction-tuning performance often emphasizes the need for higher-quality supervised fine-tuning (SFT) datasets, which typically involves expensive data filtering with proprietary LLMs or labor-intensive data generation by human annotators. However, these approaches do not fully leverage the datasets' intrinsic properties, resulting in high computational and labor costs, thereby limiting scalability and performance gains. In this paper, we propose SFTMix, a novel recipe that elevates instruction-tuning performance beyond the conventional NTP paradigm, without the need for well-curated datasets. Observing that LLMs exhibit uneven confidence across the semantic representation space, we argue that examples with different confidence levels should play distinct roles during the instruction-tuning process. Based on this insight, SFTMix leverages training dynamics to identify examples with varying confidence levels, then applies a Mixup-based regularization to mitigate overfitting on confident examples while propagating supervision signals to improve learning on relatively unconfident ones. This approach enables SFTMix to significantly outperform NTP across a wide range of instruction-following and healthcare domain-specific SFT tasks, demonstrating its adaptability to diverse LLM families and scalability to datasets of any size. Comprehensive ablation studies further verify the robustness of SFTMix's design choices, underscoring its versatility in consistently enhancing performance across different LLMs and datasets in broader natural language processing applications.
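The two-step recipe described above can be sketched in a minimal form: partition examples by a confidence statistic derived from training dynamics, then apply Mixup between confident and unconfident examples. This is an illustrative sketch only; the confidence proxy (mean gold-token probability), the threshold, and the function names (`split_by_confidence`, `mixup`) are assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_by_confidence(gold_token_probs, threshold=0.5):
    """Partition examples into confident / unconfident index sets using the
    mean probability the model assigns to each example's gold tokens.
    (A simple stand-in for the training-dynamics statistic; the paper's
    exact criterion may differ.)"""
    conf = np.array([p.mean() for p in gold_token_probs])
    return np.where(conf >= threshold)[0], np.where(conf < threshold)[0]

def mixup(h_conf, y_conf, h_unconf, y_unconf, alpha=2.0):
    """Standard Mixup: convex combination of representations and one-hot
    targets, with the mixing ratio drawn from Beta(alpha, alpha). Pairing a
    confident example with an unconfident one regularizes the former while
    propagating its supervision signal to the latter."""
    lam = rng.beta(alpha, alpha)
    h_mix = lam * h_conf + (1 - lam) * h_unconf
    y_mix = lam * y_conf + (1 - lam) * y_unconf
    return h_mix, y_mix, lam
```

In practice the interpolated pair `(h_mix, y_mix)` would feed the usual NTP cross-entropy, so the Mixup loss reduces to a `lam`-weighted combination of the two examples' losses.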