Kernel ridge regression (KRR) is a foundational tool in machine learning, with recent work emphasizing its connections to neural networks. However, existing theory primarily addresses the i.i.d. setting, while real-world data often exhibit structured dependencies, particularly in applications such as denoising score learning, where multiple noisy observations derive from shared underlying signals. We present the first systematic study of KRR generalization for non-i.i.d. data with signal-noise causal structure, in which observations represent different noisy views of common signals. By developing a novel blockwise decomposition method that enables precise concentration analysis for dependent data, we derive excess risk bounds for KRR that depend explicitly on: (1) the kernel spectrum, (2) the causal structure parameters, and (3) the sampling mechanism (including the relative sample sizes of signals and noises). We further apply our results to denoising score learning, establishing generalization guarantees and providing principled guidance for sampling noisy data points. This work advances KRR theory while providing practical tools for analyzing dependent data in modern machine learning applications.
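As a rough illustration of the data-generating setting described above (not code from the paper): the sketch below draws m hypothetical signals, observes each through k noisy views so that views sharing a signal are dependent, and fits KRR via its standard closed form. The scalar signals, additive Gaussian noise, RBF kernel, and all names (m, k, lam) are illustrative assumptions, not quantities specified in the abstract.

```python
import numpy as np

# Hypothetical signal-noise structure: m latent signals s_j, each observed
# through k noisy views x_{j,i} = s_j + eps_{j,i}. Views that share a signal
# are dependent, which is the non-i.i.d. structure the bounds address.
rng = np.random.default_rng(0)
m, k = 50, 4                                  # signals / noisy views per signal
signals = rng.uniform(-2.0, 2.0, size=m)
noise = 0.3 * rng.standard_normal(size=(m, k))
X = (signals[:, None] + noise).reshape(-1)    # n = m * k dependent inputs
y = np.sin(signals)[:, None].repeat(k, axis=1).reshape(-1)  # targets tied to signals

def rbf_kernel(a, b, bandwidth=0.5):
    """Gaussian (RBF) kernel matrix between 1-D input arrays a and b."""
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * bandwidth ** 2))

# Standard closed-form KRR fit: alpha = (K + n * lambda * I)^{-1} y.
n, lam = X.size, 1e-2
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)

# Predict at fresh points; the squared error against the true regression
# function is a plug-in estimate of the excess risk the paper bounds.
x_test = np.linspace(-2.0, 2.0, 200)
y_hat = rbf_kernel(x_test, X) @ alpha
print("plug-in excess risk estimate:", np.mean((y_hat - np.sin(x_test)) ** 2))
```

Varying m and k at fixed n = m * k is the kind of sampling-mechanism trade-off (relative sample sizes of signals and noises) that the derived bounds make explicit.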