When concept shift and sample scarcity are present in the target domain of interest, nonparametric regression learners often struggle to generalize effectively. Transfer learning remedies these issues by leveraging data or pre-trained models from similar source domains. While existing generalization analyses of kernel-based transfer learning typically rely on correctly specified models, we present a transfer learning procedure that is robust to model misspecification while adaptively attaining optimality. To facilitate our analysis and avoid the risk of saturation found in classical misspecified results, we establish a novel result in the misspecified single-task learning setting, showing that spectral algorithms with fixed-bandwidth Gaussian kernels can attain minimax convergence rates provided the true function lies in a Sobolev space; this result may be of independent interest. Building on it, we derive adaptive convergence rates of the excess risk when Gaussian kernels are specified in a prevalent class of hypothesis transfer learning algorithms. Our results are minimax optimal up to logarithmic factors and elucidate the key determinants of transfer efficiency.