As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves that characterize how the prediction error depends on the number of samples is restricted to either large-sample asymptotics ($m\to\infty$) or, for certain simple data distributions, to the high-dimensional asymptotics in which the number of samples scales linearly with the dimension ($m\propto d$). There is a wide gulf between these two regimes, including all higher-order scaling relations $m\propto d^r$, which are the subject of the present paper. We focus on the problem of kernel ridge regression for dot-product kernels and present precise formulas for the test error, bias, and variance, for data drawn uniformly from the sphere in the $r$th-order asymptotic scaling regime $m\to\infty$ with $m/d^r$ held constant. We observe a peak in the learning curve whenever $m \approx d^r/r!$ for any integer $r$, leading to multiple sample-wise descent and nontrivial behavior at multiple scales.
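As a rough illustration of the scales involved, the sketch below tabulates the approximate peak locations $m \approx d^r/r!$ for a given dimension. The function names `peak_locations` and `scaling_exponent` are hypothetical helpers introduced here for illustration. The code only evaluates the stated relation and does not reproduce the test-error formulas themselves.

```python
import math


def peak_locations(d, max_r):
    """Approximate sample sizes m ~ d^r / r! for r = 1..max_r.

    Illustrative helper (not from the paper): evaluates the peak
    relation quoted in the abstract for a finite dimension d.
    """
    return [d**r / math.factorial(r) for r in range(1, max_r + 1)]


def scaling_exponent(m, d):
    """Exponent r such that m = d^r, i.e. log(m) / log(d)."""
    return math.log(m) / math.log(d)


if __name__ == "__main__":
    d = 50  # assumed input dimension, chosen only for illustration
    for r, m_peak in enumerate(peak_locations(d, 3), start=1):
        print(f"r={r}: peak expected near m ~ {m_peak:.0f} "
              f"(m = d^{scaling_exponent(m_peak, d):.2f})")
```

Because the peak locations grow polynomially in $d$, consecutive peaks are separated by roughly a factor of $d/(r+1)$ in sample size, which is why the behavior appears at multiple, well-separated scales.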