Existing research on spectral algorithms in a Reproducing Kernel Hilbert Space (RKHS) has focused primarily on general kernel functions, often neglecting the inherent structure of the input feature space. Our paper takes a different perspective, assuming that the input data lie on a low-dimensional manifold embedded in a higher-dimensional Euclidean space. We study the convergence of spectral algorithms in the RKHSs generated by heat kernels, known as diffusion spaces. Incorporating the manifold structure of the input, we employ integral operator techniques to derive tight upper bounds on convergence with respect to generalized norms. These bounds show that the estimators converge to the target function in a strong sense: the function itself and its derivatives converge simultaneously. The bounds offer two significant advantages. First, they depend only on the intrinsic dimension of the input manifold, thereby providing a more focused analysis. Second, they yield convergence rates for derivatives of any k-th order, all obtained from the same spectral algorithms. Furthermore, we establish minimax lower bounds to demonstrate the asymptotic optimality of these conclusions in specific contexts. Our study confirms that spectral algorithms are practically significant in the broader context of high-dimensional approximation.
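To make the setting concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes the simplest possible manifold, the unit circle S^1 embedded in the plane, whose heat kernel has a known Fourier series (truncated here). It then fits a spectral algorithm by applying a filter function to the eigenvalues of the normalized Gram matrix. The function names (`heat_kernel_circle`, `fit_spectral`, `demo`), the diffusion time `t`, the regularization level, and the test target are all assumptions chosen for illustration. Tikhonov regularization (kernel ridge regression) and spectral cut-off are used as two standard filter functions.

```python
import numpy as np

def heat_kernel_circle(theta_a, theta_b, t=0.05, n_terms=30, deriv=False):
    """Truncated heat kernel on S^1, or its derivative in the first argument."""
    diff = theta_a[:, None] - theta_b[None, :]
    k = np.arange(1, n_terms + 1)
    weights = np.exp(-t * k**2)      # heat-semigroup eigenvalues exp(-t k^2)
    phase = diff[..., None] * k      # shape (len_a, len_b, n_terms)
    if deriv:
        return -(np.sin(phase) @ (k * weights)) / np.pi
    return (1.0 + 2.0 * (np.cos(phase) @ weights)) / (2.0 * np.pi)

def tikhonov(lam):
    """Filter g(s) = 1 / (s + lam), i.e. kernel ridge regression."""
    return lambda s: 1.0 / (s + lam)

def spectral_cutoff(lam):
    """Filter g(s) = 1 / s on eigenvalues s >= lam, and 0 below the threshold."""
    return lambda s: np.where(s >= lam, 1.0 / np.maximum(s, lam), 0.0)

def fit_spectral(theta_train, y_train, filter_fn, t=0.05):
    """Return a predictor x -> k(x, X) g(K/n) y / n for the given filter g."""
    n = len(theta_train)
    gram = heat_kernel_circle(theta_train, theta_train, t) / n
    eigvals, eigvecs = np.linalg.eigh(gram)
    eigvals = np.clip(eigvals, 0.0, None)   # remove tiny negative round-off
    coef = eigvecs @ (filter_fn(eigvals) * (eigvecs.T @ y_train)) / n

    def predict(theta_new, deriv=False):
        # Differentiating the kernel gives the derivative of the estimator.
        return heat_kernel_circle(theta_new, theta_train, t, deriv=deriv) @ coef

    return predict

def demo(filter_fn, n=200, noise=0.1, seed=0):
    """Fit noisy samples of a smooth target; return RMSE of f and of f'."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
    target = lambda th: np.sin(2 * th) + 0.5 * np.cos(3 * th)
    target_d = lambda th: 2 * np.cos(2 * th) - 1.5 * np.sin(3 * th)
    y = target(theta) + noise * rng.standard_normal(n)
    predict = fit_spectral(theta, y, filter_fn)
    grid = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
    rmse_f = np.sqrt(np.mean((predict(grid) - target(grid)) ** 2))
    rmse_df = np.sqrt(np.mean((predict(grid, deriv=True) - target_d(grid)) ** 2))
    return rmse_f, rmse_df

if __name__ == "__main__":
    print("Tikhonov:", demo(tikhonov(1e-3)))
    print("Cut-off: ", demo(spectral_cutoff(1e-3)))
```

The same fitted coefficients serve both the function estimate and its derivative, which mirrors the abstract's point that derivative estimates come from the same spectral algorithm. The circle is only a toy case: on a general manifold the heat kernel has no closed form and would have to be approximated.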