We perform a study on kernel regression for large-dimensional data (where the sample size $n$ depends polynomially on the dimension $d$ of the samples, i.e., $n\asymp d^{\gamma}$ for some $\gamma >0$). We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity $\varepsilon_{n}^{2}$ and the metric entropy $\bar{\varepsilon}_{n}^{2}$, respectively. When the target function falls into the RKHS associated with a (general) inner product model defined on $\mathbb{S}^{d}$, we utilize this new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n\asymp d^{\gamma}$ for $\gamma = 2, 4, 6, 8, \cdots$. We then determine the optimal rate of the excess risk of kernel regression for all $\gamma>0$ and find that the curve of the optimal rate as a function of $\gamma$ exhibits several new phenomena, including multiple descent and periodic plateau behaviors. As an application, we also provide a similar explicit description of the curve of the optimal rate for the neural tangent kernel (NTK). As a direct corollary, these claims hold for wide neural networks as well.
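For concreteness, here is a minimal sketch of the setting studied above: covariates sampled uniformly on $\mathbb{S}^{d}$, an inner-product kernel $K(x,x')=\Phi(\langle x,x'\rangle)$, and kernel ridge regression with $n\asymp d^{\gamma}$. The particular kernel $\Phi$, target $f^{*}$, and regularization $\lambda$ below are illustrative assumptions, not choices made in the paper.

```python
import numpy as np

# Illustrative kernel ridge regression on S^d with an inner-product kernel
# K(x, x') = Phi(<x, x'>). Phi, f_star, and lambda_reg are assumed here
# for illustration only.

rng = np.random.default_rng(0)
d, gamma = 10, 2
n = d ** gamma                      # sample size n ≍ d^gamma

# Sample covariates uniformly on S^d (the unit sphere in R^{d+1}).
X = rng.standard_normal((n, d + 1))
X /= np.linalg.norm(X, axis=1, keepdims=True)

f_star = lambda x: x[:, 0] ** 2     # a simple illustrative target function
y = f_star(X) + 0.1 * rng.standard_normal(n)

Phi = lambda t: np.exp(t - 1.0)     # an inner-product kernel on [-1, 1]
K = Phi(X @ X.T)

lambda_reg = n ** -0.5              # illustrative ridge parameter
alpha = np.linalg.solve(K + n * lambda_reg * np.eye(n), y)

# Estimate the excess risk on fresh test points drawn from the same sphere.
X_test = rng.standard_normal((1000, d + 1))
X_test /= np.linalg.norm(X_test, axis=1, keepdims=True)
f_hat = Phi(X_test @ X.T) @ alpha
print("empirical excess risk:", np.mean((f_hat - f_star(X_test)) ** 2))
```

Repeating this experiment over a grid of $d$ (holding $\gamma$ fixed) would trace out the empirical counterpart of the rate curve described above.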