Post-training fundamentally alters the behavior of large language models (LLMs), yet its impact on the internal parameter space remains poorly understood. In this work, we conduct a systematic singular value decomposition (SVD) analysis of the principal linear layers in pretrained LLMs, focusing on two widely adopted post-training methods: instruction tuning and long chain-of-thought (Long-CoT) distillation. Our analysis reveals two consistent and unexpected structural changes: (1) a near-uniform geometric scaling of singular values across layers, which theoretically modulates attention scores; and (2) highly consistent orthogonal transformations applied to the left and right singular vectors of each matrix. Disrupting this orthogonal consistency leads to catastrophic performance degradation. Based on these findings, we propose a simple yet effective framework that interprets post-training as a reparameterization of fixed subspaces in the pretrained parameter space. Further experiments show that singular value scaling acts as a secondary effect, analogous to a temperature adjustment, whereas the core functional transformation lies in the coordinated rotation of singular vectors. These results challenge the prevailing view of the parameter space of large models as a black box, uncover the first clear regularities in how parameters evolve during training, and offer a new perspective for deeper investigation into model parameter changes.
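The two structural changes above can be illustrated with a minimal NumPy sketch. This is a synthetic toy (a random matrix standing in for a layer's weight, with hypothetical scale `alpha` and rotation `Q`), not the paper's actual analysis pipeline: it constructs a "post-trained" weight by uniformly scaling the singular values and applying the same orthogonal transformation to the left and right singular vectors, then verifies that SVD recovers the uniform scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one linear layer's pretrained weight matrix.
d = 64
W_pre = rng.standard_normal((d, d))
U, S, Vt = np.linalg.svd(W_pre)

# Simulate the two reported structural changes:
# (1) near-uniform geometric scaling of singular values by a factor alpha;
# (2) a shared orthogonal transformation Q applied consistently to both
#     the left and right singular vectors.
alpha = 1.3
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix
W_post = (Q @ U) @ np.diag(alpha * S) @ (Vt @ Q.T)

# Because Q@U and (Vt@Q.T).T are still orthogonal, this is a valid SVD of
# W_post, so its singular values should be exactly alpha * S.
S_post = np.linalg.svd(W_post, compute_uv=False)
ratio = S_post / S
print("singular-value ratio: mean=%.4f std=%.2e" % (ratio.mean(), ratio.std()))
```

Under this construction the ratio of post- to pre-training singular values is constant across the spectrum, which is the signature the abstract describes as a near-uniform geometric scaling; the coordinated rotation `Q` leaves the singular values untouched and only reorients the subspaces.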