We investigate double descent and scaling laws in terms of weight norms rather than the number of parameters. Specifically, we analyze linear and random features models using the deterministic equivalence approach from random matrix theory. We precisely characterize how the weight norms concentrate around deterministic quantities and elucidate the relationship between the expected test error and norm-based capacity (complexity). Our results rigorously answer whether double descent exists under norm-based capacity and reshape the corresponding scaling laws. Moreover, they prompt a rethinking of the data-parameter paradigm, from under-parameterized to over-parameterized regimes, by shifting the focus from parameter count to weight norms.
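As a minimal numerical sketch of the setting described above (not the paper's method), the snippet below fits a min-norm random features regression on synthetic data and tracks both the test error and the learned weight norm as the number of features $p$ varies; all dimensions, the noise level, and the ReLU feature map are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: compare parameter-count capacity (p) against
# norm-based capacity (||theta||) in a random features model.
rng = np.random.default_rng(0)
n, d = 200, 30                                  # samples, input dimension (assumed)
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)    # ground-truth linear signal
y = X @ w_star + 0.1 * rng.standard_normal(n)   # noisy labels

X_test = rng.standard_normal((2000, d))
y_test = X_test @ w_star

for p in [50, 100, 190, 200, 210, 400, 1600]:
    W = rng.standard_normal((d, p)) / np.sqrt(d)              # random first layer
    Phi = np.maximum(X @ W, 0)                                # ReLU random features
    Phi_test = np.maximum(X_test @ W, 0)
    theta = np.linalg.pinv(Phi) @ y                           # min-norm interpolant
    err = np.mean((Phi_test @ theta - y_test) ** 2)
    print(f"p={p:5d}  test MSE={err:8.4f}  ||theta||={np.linalg.norm(theta):7.3f}")
```

In such experiments the test error typically peaks near the interpolation threshold $p \approx n$ and then decreases (double descent in parameter count), while plotting the error against the weight norm $\|\theta\|$ instead of $p$ can change the picture, which is the distinction the abstract highlights.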