Selecting an appropriate kernel is a central challenge in kernel-based spectral methods. In \emph{Kernelized Diffusion Maps} (KDM), the kernel determines the accuracy of the RKHS estimator of a diffusion-type operator and hence the quality and stability of the recovered eigenfunctions. We introduce two complementary approaches to adaptive kernel selection for KDM. First, we develop a variational outer loop that learns continuous kernel parameters, including bandwidths and mixture weights, by differentiating through the Cholesky-reduced KDM eigenproblem with an objective combining eigenvalue maximization, subspace orthonormality, and RKHS regularization. Second, we propose an unsupervised cross-validation pipeline that selects kernel families and bandwidths using an eigenvalue-sum criterion together with random Fourier features for scalability. Both methods share a common theoretical foundation: we prove Lipschitz dependence of KDM operators on kernel weights, continuity of spectral projectors under a gap condition, a residual-control theorem certifying proximity to the target eigenspace, and exponential consistency of the cross-validation selector over a finite kernel dictionary.
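For readers unfamiliar with the scalability device named above, the following is a minimal sketch of standard random Fourier features for a Gaussian kernel. It is not the KDM pipeline or the selection criterion itself. The function names `rff_features` and `gaussian_kernel`, the bandwidth convention $\exp(-\|x-y\|^2/(2\sigma^2))$, and all numerical settings are illustrative assumptions rather than choices taken from the paper.

```python
import numpy as np

def rff_features(X, num_features, bandwidth, rng):
    """Map X of shape (n, d) to random Fourier features whose inner products
    approximate the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)).
    (Hypothetical helper; the bandwidth convention is an assumption.)"""
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density, N(0, I / bandwidth^2).
    W = rng.normal(scale=1.0 / bandwidth, size=(d, num_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

def gaussian_kernel(X, bandwidth):
    """Exact Gaussian Gram matrix, used here only as a reference."""
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(D2, 0.0) / (2.0 * bandwidth**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                      # synthetic data, illustration only
Z = rff_features(X, num_features=5000, bandwidth=1.5, rng=rng)
K_exact = gaussian_kernel(X, bandwidth=1.5)
K_approx = Z @ Z.T                                # low-rank surrogate for the Gram matrix
max_err = np.max(np.abs(K_exact - K_approx))
```

The point of the construction is that the $n \times D$ feature matrix `Z` can stand in for the $n \times n$ Gram matrix, so downstream linear algebra scales with the number of features $D$ rather than with $n^2$. The approximation error decays roughly like $D^{-1/2}$.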