We propose a novel framework for analyzing the generalization error of transfer learning through the lens of differential calculus on the space of probability measures. In particular, we consider two main transfer learning scenarios, $\alpha$-ERM and fine-tuning via KL-regularized empirical risk minimization, and establish generic conditions under which convergence rates for the generalization error and the population risk are derived. Based on these theoretical results, we demonstrate the benefits of transfer learning for a one-hidden-layer neural network in the mean-field regime, under suitable integrability and regularity assumptions on the loss and activation functions.
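For concreteness, a plausible form of the two objectives named above is sketched below; this is purely illustrative, since the abstract does not spell them out, and the mixing convention for $\alpha$, the regularization weight $\lambda$, and the source measure $\mu_S$ are assumptions rather than the paper's definitions.
% Illustrative sketch only: the precise formulations appear in the body of the paper.
% Here \mu is a probability measure over the parameters of the one-hidden-layer network,
% (x_i^T, y_i^T) and (x_j^S, y_j^S) denote target and source samples, and \mu_S is the
% measure learned on the source task (all assumed notation).
\begin{align*}
  \text{($\alpha$-ERM)} \qquad
  &\min_{\mu}\;(1-\alpha)\,\frac{1}{n_T}\sum_{i=1}^{n_T}\ell\bigl(f_\mu(x_i^T),y_i^T\bigr)
   \;+\;\alpha\,\frac{1}{n_S}\sum_{j=1}^{n_S}\ell\bigl(f_\mu(x_j^S),y_j^S\bigr),\\
  \text{(KL-regularized fine-tuning)} \qquad
  &\min_{\mu}\;\frac{1}{n_T}\sum_{i=1}^{n_T}\ell\bigl(f_\mu(x_i^T),y_i^T\bigr)
   \;+\;\lambda\,\mathrm{KL}\bigl(\mu\,\|\,\mu_S\bigr).
\end{align*}
In the first objective the source data are down-weighted by $\alpha$; in the second, the target empirical risk is minimized while the KL term keeps the learned measure close to the source-trained measure $\mu_S$.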