Neural scaling laws underlie many of the recent advances in deep learning, yet their theoretical understanding remains largely confined to linear models. In this work, we present a systematic analysis of scaling laws for quadratic and diagonal neural networks in the feature learning regime. Leveraging connections with matrix compressed sensing and LASSO, we derive a detailed phase diagram for the scaling exponents of the excess risk as a function of sample complexity and weight decay. This analysis uncovers crossovers between distinct scaling regimes and plateau behaviors, mirroring phenomena widely reported in the empirical neural scaling literature. Furthermore, we establish a precise link between these regimes and the spectral properties of the trained network weights, which we characterize in detail. As a consequence, we provide a theoretical validation of recent empirical observations connecting the emergence of power-law tails in the weight spectrum with network generalization performance, yielding an interpretation from first principles.