Training stability is typically regarded as a prerequisite for reliable optimization in large language models. In this work, we analyze how stabilizing training dynamics affects the induced generation distribution. We show that under standard maximum likelihood training, stable parameter trajectories drive stationary solutions toward approximate minimization of the forward KL divergence to the empirical distribution, while implicitly reducing generative entropy. As a consequence, the learned model can concentrate probability mass on a limited subset of empirical modes, exhibiting systematic degeneration despite smooth loss convergence. We empirically validate this effect using a controlled feedback-based training framework that stabilizes internal generation statistics, observing consistently low-entropy outputs and repetitive behavior across architectures and random seeds. These results indicate that optimization stability and generative expressivity are not inherently aligned, and that stability alone is an insufficient indicator of generative quality.
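The two quantities at the center of this analysis, the forward KL divergence to the empirical distribution and the generative (Shannon) entropy of the model, can be made concrete with a toy categorical example. The sketch below uses hypothetical toy distributions (the numbers are illustrative, not taken from the paper's experiments) to show how a model can keep a well-defined forward KL to the data while its entropy drops, i.e., while it concentrates mass on a subset of the empirical modes:

```python
import math

def forward_kl(p, q):
    """KL(p || q) = sum_x p(x) * log(p(x) / q(x)).
    With p the empirical distribution, minimizing this over q
    is equivalent to maximum likelihood training."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(q):
    """Shannon entropy H(q) = -sum_x q(x) * log q(x).
    Lower entropy means probability mass is concentrated
    on fewer modes of the support."""
    return -sum(qi * math.log(qi) for qi in q if qi > 0)

# Hypothetical empirical distribution over four modes.
p_emp = [0.4, 0.3, 0.2, 0.1]

# A mass-covering model that spreads probability over all modes,
# versus a collapsed model that concentrates on a subset of modes.
q_covering = [0.35, 0.30, 0.20, 0.15]
q_collapsed = [0.70, 0.25, 0.04, 0.01]

print("covering:  KL =", forward_kl(p_emp, q_covering),
      " H =", entropy(q_covering))
print("collapsed: KL =", forward_kl(p_emp, q_collapsed),
      " H =", entropy(q_collapsed))
```

The collapsed model has strictly lower entropy than the covering one, which is the degeneration signature the abstract describes: smooth likelihood training can still leave the model emitting repetitive, low-entropy output.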