Self-distillation (SD) offers a promising path for adapting large language models (LLMs) without relying on stronger external teachers. However, SD in autoregressive LLMs remains challenging because self-generated trajectories are free-form, correctness is task-dependent, and plausible rationales can still provide unstable or unreliable supervision. Existing methods mainly examine isolated design choices, leaving their effectiveness, roles, and interactions unclear. In this paper, we propose UniSD, a unified framework to systematically study self-distillation. UniSD integrates complementary mechanisms that address supervision reliability, representation alignment, and training stability, including multi-teacher agreement, EMA teacher stabilization, token-level contrastive learning, feature matching, and divergence clipping. Across six benchmarks and six models from three model families, UniSD reveals when self-distillation improves over static imitation, which components drive the gains, and how these components interact across tasks. Guided by these insights, we construct UniSDfull, an integrated pipeline that combines complementary components and achieves the strongest overall performance, improving over the base model by +5.4 points and the strongest baseline by +2.8 points. Extensive evaluation highlights self-distillation as a practical and steerable approach for efficient LLM adaptation without stronger external teachers.
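Two of the stabilization mechanisms named above, EMA teacher stabilization and divergence clipping, can be illustrated with a minimal sketch. The abstract does not give UniSD's exact formulas, so this uses the generic textbook forms and should not be read as the paper's implementation. The function names, the `decay` and `clip` values, and the choice of KL(teacher || student) as the divergence are all assumptions made for illustration.

```python
import math

def ema_update(teacher, student, decay=0.99):
    """Generic EMA step: move each teacher parameter slightly toward the student.

    `decay` is an assumed hyperparameter, not a value taken from the paper.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def log_softmax(logits):
    """Numerically stable log-softmax over one token's vocabulary logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def clipped_kl(teacher_logits, student_logits, clip=2.0):
    """KL(teacher || student) for one token, capped at `clip`.

    Capping bounds the loss contribution of any single token, so one badly
    mismatched position cannot dominate the update. The threshold `clip` is
    an assumed value for illustration.
    """
    log_p = log_softmax(teacher_logits)
    log_q = log_softmax(student_logits)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return min(kl, clip)

# Toy usage: two "parameters" and a two-token vocabulary.
teacher = ema_update([1.0, 0.0], [0.0, 1.0], decay=0.9)
loss = clipped_kl([100.0, 0.0], [0.0, 100.0], clip=2.0)
```

In this toy usage the teacher moves one tenth of the way toward the student, and the large teacher-student disagreement is capped at the clipping threshold rather than contributing its full divergence.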