Random features models play a distinguished role in the theory of deep learning, describing the behavior of neural networks close to their infinite-width limit. In this work, we present a thorough analysis of the generalization performance of random features models for generic supervised learning problems with Gaussian data. Our approach, built with tools from the statistical mechanics of disordered systems, maps the random features model to an equivalent polynomial model, and allows us to plot average generalization curves as functions of the two main control parameters of the problem: the number of random features $N$ and the size $P$ of the training set, both assumed to scale as powers of the input dimension $D$. Our results go beyond the case of proportional scaling among $N$, $P$, and $D$. They are in accordance with rigorous bounds known for certain particular learning tasks and are in quantitative agreement with numerical experiments performed over many orders of magnitude of $N$ and $P$. We also find good agreement far from the asymptotic limits in which $D\to \infty$ and at least one of $P/D^K$ and $N/D^L$ remains finite.
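To make the setting concrete, the following is a minimal numerical sketch of a random features model trained on Gaussian data. It is not the analysis or the experimental setup of this work: the teacher function (linear plus quadratic in one random direction), the ReLU nonlinearity, the ridge penalty, and all function and parameter names are assumptions chosen only for illustration.

```python
import numpy as np

def random_features_test_error(D, N, P, ridge=1e-3, n_test=2000, seed=0):
    """Test error of ridge regression on N random ReLU features.

    D: input dimension, N: number of random features, P: training set size.
    The teacher and all hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Gaussian data: i.i.d. standard normal inputs.
    X_train = rng.standard_normal((P, D))
    X_test = rng.standard_normal((n_test, D))

    # Hypothetical teacher: linear plus quadratic term along one direction w.
    w = rng.standard_normal(D) / np.sqrt(D)

    def teacher(X):
        z = X @ w
        return z + 0.5 * (z**2 - 1.0)

    y_train = teacher(X_train)
    y_test = teacher(X_test)

    # Fixed (untrained) random first layer, followed by a ReLU.
    F = rng.standard_normal((D, N)) / np.sqrt(D)

    def features(X):
        return np.maximum(X @ F, 0.0)

    Phi_train = features(X_train)
    Phi_test = features(X_test)

    # Only the readout weights are trained, here by ridge regression.
    A = Phi_train.T @ Phi_train + ridge * np.eye(N)
    a = np.linalg.solve(A, Phi_train.T @ y_train)

    # Mean squared error on fresh Gaussian test points.
    return float(np.mean((Phi_test @ a - y_test) ** 2))
```

An empirical generalization curve could then be approximated by averaging `random_features_test_error` over several seeds while sweeping `N` and `P` over powers of `D`, for example `P = int(D ** 1.5)`.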