Dynamic feature transformation (the rich regime) does not always align with predictive performance (better representations), yet accuracy is often used as a proxy for richness, limiting analysis of their relationship. We propose a computationally efficient, performance-independent metric of richness grounded in the low-rank bias of rich dynamics, which recovers neural collapse as a special case. The metric is empirically more stable than existing alternatives and captures known lazy-to-rich transitions (e.g., grokking) without relying on accuracy. We further use it to examine how training factors (e.g., learning rate) relate to richness, confirming established assumptions and surfacing new observations (e.g., batch normalization promotes rich dynamics). We also introduce an eigendecomposition-based visualization to support interpretability; together, these provide a diagnostic tool for studying the relationship between training factors, dynamics, and representations.
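The abstract does not give the metric's exact form. As a minimal, hypothetical sketch of the underlying idea, assuming richness is read off the low-rank bias of learned representations, one could track the effective rank (Roy & Vetterli, 2007) of a layer's feature matrix, computed from its singular-value spectrum. The `effective_rank` helper below is illustrative only, not the paper's metric.

```python
import numpy as np

def effective_rank(features: np.ndarray) -> float:
    """Entropy-based effective rank of a feature matrix (n_samples x dim).

    A low effective rank is one common signature of rich, feature-learning
    dynamics; this is an illustrative proxy, not the paper's exact metric.
    """
    # Center the features and take singular values of the centered matrix.
    centered = features - features.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    p = s / s.sum()                       # normalized singular-value spectrum
    p = p[p > 0]                          # drop zeros before taking logs
    entropy = -(p * np.log(p)).sum()      # Shannon entropy of the spectrum
    return float(np.exp(entropy))         # effective rank (Roy & Vetterli, 2007)

# Toy comparison: a near-low-rank representation vs. an isotropic one.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(512, 4)) @ rng.normal(size=(4, 128))
isotropic = rng.normal(size=(512, 128))
print(effective_rank(low_rank))   # small (close to 4)
print(effective_rank(isotropic))  # large (near the ambient dimension, 128)
```

Note that this quantity depends only on the representation's spectrum, not on labels or accuracy, which matches the abstract's requirement of a performance-independent measure.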