We obtain upper bounds on the estimation error of Kernel Ridge Regression (KRR) for all non-negative regularization parameters, offering a geometric perspective on various phenomena in KRR. As applications: 1. We address the multiple descent problem, unifying the proofs of arXiv:1908.10292 and arXiv:1904.12191 for polynomial kernels, and we establish multiple descent for the upper bound on the estimation error of KRR under sub-Gaussian design in the non-asymptotic regime. 2. For a sub-Gaussian design vector in the non-asymptotic regime, we prove the Gaussian Equivalence Conjecture. 3. We offer a novel perspective on the linearization of kernel matrices of non-linear kernels, extending it to the power regime for polynomial kernels. 4. Our theory applies to data-dependent kernels, providing a convenient and accurate tool for the feature learning regime in deep learning theory. 5. Our theory extends the results of arXiv:2009.14286 under weak moment assumptions. Our proof is based on three mathematical tools developed in this paper that may be of independent interest: 1. A Dvoretzky-Milman theorem for ellipsoids under (very) weak moment assumptions. 2. A Restricted Isometry Property in Reproducing Kernel Hilbert Spaces under embedding index conditions. 3. A concentration inequality for finite-degree polynomial kernel functions.
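For context, a minimal sketch of the standard KRR setup the abstract refers to; the notation (samples $(x_i, y_i)_{i=1}^n$, kernel $k$ with RKHS $\mathcal{H}$, kernel matrix $K = (k(x_i, x_j))_{i,j=1}^n$, regression target $f^*$) is ours, not taken from the paper:

\[
\hat f_\lambda = \operatorname*{arg\,min}_{f \in \mathcal{H}} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 + \lambda \lVert f \rVert_{\mathcal{H}}^2, \qquad \hat f_\lambda(x) = \bigl( k(x, x_1), \dots, k(x, x_n) \bigr) \bigl( K + \lambda n I_n \bigr)^{-1} y, \quad \lambda \ge 0.
\]

The estimation error is the distance from $\hat f_\lambda$ to the regression target, e.g. $\lVert \hat f_\lambda - f^* \rVert_{L^2}$; "all non-negative regularization parameters" includes the ridgeless limit $\lambda = 0$, i.e. minimum-norm interpolation.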
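As an illustrative pointer to application 2, one commonly stated form of Gaussian equivalence in the random-features literature replaces a non-linear feature by a linear-plus-noise Gaussian surrogate with matched first and second moments; the activation $\sigma$ and coefficients $\mu_0, \mu_1, \mu_*$ below are our notation for this sketch, and the paper's precise statement may differ:

\[
\sigma(w^\top x) \;\leftrightarrow\; \mu_0 + \mu_1 \, w^\top x + \mu_* z, \qquad \mu_0 = \mathbb{E}[\sigma(g)], \quad \mu_1 = \mathbb{E}[g \, \sigma(g)], \quad \mu_*^2 = \mathbb{E}[\sigma(g)^2] - \mu_0^2 - \mu_1^2,
\]

where $g \sim \mathcal{N}(0,1)$, $w^\top x$ is normalized to unit variance, and $z$ is an independent standard Gaussian; "equivalence" means the two models yield matching errors in the relevant regime.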