Scale does not uniformly improve reasoning; it restructures it. Analyzing more than 25,000 chain-of-thought trajectories across four domains (Law, Science, Code, Math) and two scales (8B and 70B parameters), we find that scaling triggers domain-specific phase transitions rather than uniform capability gains. Legal reasoning undergoes Crystallization: a 45% collapse in representational dimensionality (d95: 501 -> 274), a 31% increase in trajectory alignment, and 10x manifold untangling. Scientific and mathematical reasoning remain Liquid, geometrically invariant despite a 9x increase in parameters. Code reasoning forms a discrete Lattice of strategic modes (silhouette: 0.13 -> 0.42). This geometry predicts learnability. We introduce Neural Reasoning Operators, learned mappings from initial to terminal hidden states. In crystalline legal reasoning, our operator achieves 63.6% accuracy on held-out tasks via probe decoding, predicting reasoning endpoints without traversing intermediate states. We further identify a universal oscillatory signature (coherence ~ -0.4), invariant across domains and scales, suggesting that attention and feedforward layers drive reasoning through opposing dynamics. These findings establish that the cost of thought is determined not by task difficulty but by manifold geometry, offering a blueprint for inference acceleration where topology permits.
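To make the abstract's central quantities concrete, the following is a minimal sketch assuming hidden states have been extracted as NumPy arrays. The function names (`d95`, `fit_operator`), the pooling of trajectories into a matrix, and the ridge-regression stand-in for the Neural Reasoning Operator are illustrative assumptions, not the paper's implementation; the paper's operator architecture and probe decoder are not specified here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

def d95(hidden_states: np.ndarray) -> int:
    """Smallest number of principal components explaining >= 95% of variance.

    `hidden_states` is assumed to be an (n_trajectories, d_model) matrix of
    pooled activations from one domain at one model scale.
    """
    pca = PCA().fit(hidden_states)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumulative, 0.95) + 1)

def fit_operator(h_initial: np.ndarray, h_terminal: np.ndarray) -> Ridge:
    """Hypothetical stand-in for a Neural Reasoning Operator: a linear map
    from the hidden state at the first reasoning step to the hidden state
    at the last, skipping all intermediate states."""
    return Ridge(alpha=1.0).fit(h_initial, h_terminal)

# Example with synthetic data: fit on 800 traces, predict terminal states
# for 200 held-out traces without traversing intermediate steps.
rng = np.random.default_rng(0)
h0 = rng.standard_normal((1000, 512))            # initial hidden states
hT = (h0 @ rng.standard_normal((512, 512))) * 0.1  # synthetic terminal states
operator = fit_operator(h0[:800], hT[:800])
predicted_terminal = operator.predict(h0[800:])
print("d95 of terminal states:", d95(hT))
```

Under this sketch, a drop in `d95` from 501 to 274 would correspond to the Crystallization reported for legal reasoning, and a well-fit `fit_operator` would correspond to the regime where manifold geometry makes endpoints predictable from initial states.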