We study a fundamental transfer learning process from a source linear regression task to a target linear regression task, including overparameterized settings where there are more learned parameters than data samples. The target task is learned using its own training data together with the parameters previously computed for the source task. We define a transfer learning approach to the target task as a linear regression optimization with a regularization on the distance between the to-be-learned target parameters and the already-learned source parameters. We analytically characterize the generalization performance of our transfer learning approach and demonstrate its ability to resolve the generalization error peak that the minimum L2-norm solution to linear regression exhibits in the double descent phenomenon. Moreover, we show that for sufficiently related tasks, the optimally tuned transfer learning approach can outperform the optimally tuned ridge regression method, even when the true parameter vector conforms to an isotropic Gaussian prior distribution. Namely, we demonstrate that transfer learning can beat the minimum mean square error (MMSE) solution of the independent target task. Our results emphasize the ability of transfer learning to extend the solution space of the target task and thereby attain an improved MMSE solution. We formulate the linear MMSE solution for our transfer learning setting and point out its key differences from the common design philosophy of transfer learning.
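The regularized formulation described in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible form of the penalty, a squared L2 distance between the target parameters and the source parameters weighted by a scalar; the paper's actual formulation may relate the two tasks differently (for example through a linear operator). The names `transfer_ridge`, `theta_src`, and `alpha`, as well as the synthetic data, are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def transfer_ridge(X, y, theta_src, alpha):
    """Solve min_b ||y - X b||^2 + alpha * ||b - theta_src||^2 in closed form.

    Assumption: the source parameters enter through a plain squared L2 distance.
    """
    d = X.shape[1]
    # Setting the gradient to zero gives (X^T X + alpha I) b = X^T y + alpha * theta_src.
    A = X.T @ X + alpha * np.eye(d)
    rhs = X.T @ y + alpha * theta_src
    return np.linalg.solve(A, rhs)


def min_norm_least_squares(X, y):
    """Minimum L2-norm least-squares solution, used as the no-transfer baseline."""
    return np.linalg.pinv(X) @ y


# Synthetic, illustrative data only (not taken from the paper).
rng = np.random.default_rng(0)
n, d = 20, 50  # fewer samples than parameters: overparameterized
beta_true = rng.normal(size=d) / np.sqrt(d)
# A hypothetical source estimate that is close to the target parameters.
theta_src = beta_true + 0.1 * rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ beta_true + 0.1 * rng.normal(size=n)

beta_tl = transfer_ridge(X, y, theta_src, alpha=1.0)
beta_mn = min_norm_least_squares(X, y)

# Squared parameter error; for isotropic Gaussian test inputs this equals
# the excess test error of each estimator.
print("transfer:", np.sum((beta_tl - beta_true) ** 2))
print("min-norm:", np.sum((beta_mn - beta_true) ** 2))
```

Setting `theta_src` to the zero vector reduces this sketch to ordinary ridge regression, which is one way to see why regularizing toward a related source solution enlarges the set of estimators available to the target task.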