We consider the problem of learning a target function corresponding to a deep, extensive-width, non-linear neural network with random Gaussian weights. We work in the asymptotic limit where the number of samples, the input dimension, and the network width are proportionally large. We propose a closed-form expression for the Bayes-optimal test error, for both regression and classification tasks. We further compute closed-form expressions for the test errors of ridge regression, kernel regression, and random features regression. In particular, we find that optimally regularized ridge regression and kernel regression both achieve Bayes-optimal performance, while the logistic loss yields a near-optimal test error for classification. Finally, we show numerically that when the number of samples grows faster than the dimension, ridge and kernel methods become suboptimal, while neural networks achieve a test error close to zero from quadratically many samples.
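The setting described in the abstract can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions and not the authors' experimental code: the `tanh` activation, the standard Gaussian inputs, the noiseless labels, the `1/sqrt(fan_in)` weight scaling, and the helper names `make_target`, `ridge_fit`, and `ridge_test_error` are all hypothetical choices made here. The sketch draws a random deep network as the target, samples data with the number of samples, input dimension, and width of comparable size, and estimates the test error of ridge regression empirically.

```python
import numpy as np


def make_target(d, width, depth, rng):
    """Random deep network with i.i.d. standard Gaussian weights (hypothetical helper)."""
    dims = [d] + [width] * depth
    weights = [rng.standard_normal((dims[i + 1], dims[i])) for i in range(depth)]
    readout = rng.standard_normal(width)

    def target(X):
        H = X
        for W in weights:
            # Assumed scaling: divide by sqrt(fan_in) so pre-activations stay of order one.
            H = np.tanh(H @ W.T / np.sqrt(W.shape[1]))
        # Scalar regression label, noiseless by assumption.
        return H @ readout / np.sqrt(width)

    return target


def ridge_fit(X, y, lam):
    """Closed-form ridge estimator: (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def ridge_test_error(n, d, width, depth, lam, n_test=2000, seed=0):
    """Empirical mean squared test error of ridge regression on the random-network target."""
    rng = np.random.default_rng(seed)
    target = make_target(d, width, depth, rng)
    X_train = rng.standard_normal((n, d))      # assumed Gaussian inputs
    X_test = rng.standard_normal((n_test, d))
    y_train, y_test = target(X_train), target(X_test)
    w = ridge_fit(X_train, y_train, lam)
    return float(np.mean((X_test @ w - y_test) ** 2))
```

For example, `ridge_test_error(n=400, d=100, width=100, depth=2, lam=1.0)` keeps `n`, `d`, and `width` of the same order, mimicking the proportional regime at finite size. Scanning `lam` over a grid gives an empirical stand-in for the optimally regularized ridge error discussed in the abstract; the closed-form asymptotic expressions themselves are not reproduced here.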