Existing research on spectral algorithms in Reproducing Kernel Hilbert Spaces (RKHSs) has focused primarily on general kernel functions, often neglecting the inherent structure of the input feature space. Our paper takes a different perspective, assuming that the input data lie on a low-dimensional manifold embedded in a higher-dimensional Euclidean space. We study the convergence of spectral algorithms in the RKHSs generated by heat kernels, known as diffusion spaces. Incorporating the manifold structure of the input, we employ integral operator techniques to derive tight convergence upper bounds with respect to generalized norms. These bounds indicate that the estimators converge to the target function in a strong sense, meaning that the function and its derivatives converge simultaneously. The bounds offer two significant advantages. First, they depend only on the intrinsic dimension of the input manifold rather than on the ambient dimension. Second, they yield convergence rates for derivatives of any order k, all obtained from the same spectral algorithms. Furthermore, we establish minimax lower bounds that demonstrate the asymptotic optimality of these results in specific settings. Our study confirms that spectral algorithms are practically significant in the broader context of high-dimensional approximation.
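To make the setting concrete, the following is a minimal sketch, not the paper's method or experiments. It assumes the simplest possible manifold (the unit circle embedded in R^2, intrinsic dimension 1), uses kernel ridge regression (the Tikhonov filter) as one instance of a spectral algorithm, and approximates the circle's heat kernel by a truncated Fourier series. The target function sin(2θ), the diffusion time `t`, the regularization `lam`, the truncation `n_modes`, and all function names are illustrative choices that do not come from the abstract. The same fitted coefficients are reused to estimate both the function and its first derivative.

```python
import numpy as np


def heat_kernel(theta_a, theta_b, t=0.2, n_modes=30, derivative=False):
    """Truncated Fourier series of the heat kernel on the unit circle.

    H_t(a, b) = (1 + 2 * sum_k exp(-t k^2) cos(k (a - b))) / (2 pi).
    With derivative=True, returns the partial derivative with respect to a.
    """
    diff = theta_a[:, None] - theta_b[None, :]
    k = np.arange(1, n_modes + 1)
    weights = np.exp(-t * k**2)
    if derivative:
        # d/da cos(k (a - b)) = -k sin(k (a - b))
        series = np.sin(diff[..., None] * k) @ (-k * weights)
        return 2.0 * series / (2.0 * np.pi)
    series = np.cos(diff[..., None] * k) @ weights
    return (1.0 + 2.0 * series) / (2.0 * np.pi)


rng = np.random.default_rng(0)
n, noise_std, lam = 200, 0.1, 1e-3  # illustrative values, not from the paper

# Inputs lie on a 1-dimensional manifold (the circle) embedded in R^2.
theta_train = rng.uniform(0.0, 2.0 * np.pi, size=n)
X_train = np.column_stack([np.cos(theta_train), np.sin(theta_train)])
y_train = np.sin(2.0 * theta_train) + noise_std * rng.standard_normal(n)

# Recover the intrinsic coordinate from the embedded points.
angles = np.arctan2(X_train[:, 1], X_train[:, 0])

# Tikhonov-regularized spectral algorithm (kernel ridge regression).
K = heat_kernel(angles, angles)
alpha = np.linalg.solve(K + n * lam * np.eye(n), y_train)

# Evaluate the estimator and its first derivative with the same coefficients.
theta_test = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
f_hat = heat_kernel(theta_test, angles) @ alpha
df_hat = heat_kernel(theta_test, angles, derivative=True) @ alpha

rmse_f = np.sqrt(np.mean((f_hat - np.sin(2.0 * theta_test)) ** 2))
rmse_df = np.sqrt(np.mean((df_hat - 2.0 * np.cos(2.0 * theta_test)) ** 2))
print(f"RMSE of function estimate:   {rmse_f:.4f}")
print(f"RMSE of derivative estimate: {rmse_df:.4f}")
```

The kernel here depends only on the angle, the intrinsic coordinate, and not on the ambient coordinates in `X_train`. This is the toy analogue of bounds that depend only on the intrinsic dimension. Other spectral algorithms, such as spectral cut-off or gradient descent, would replace the `np.linalg.solve` step with a different filter applied to the eigenvalues of `K`.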