We study a fundamental transfer learning process from a source to a target linear regression task, including overparameterized settings where there are more learned parameters than data samples. The target task is learned from its own training data together with the parameters previously computed for the source task. We define a transfer learning approach to the target task as a linear regression optimization with a regularization on the distance between the to-be-learned target parameters and the already-learned source parameters. We analytically characterize the generalization performance of our transfer learning approach and demonstrate its ability to resolve the generalization error peak that the minimum L2-norm solution to linear regression exhibits in the double descent phenomenon. Moreover, we show that for sufficiently related tasks, the optimally tuned transfer learning approach can outperform optimally tuned ridge regression, even when the true parameter vector conforms to an isotropic Gaussian prior distribution. That is, we demonstrate that transfer learning can beat the minimum mean square error (MMSE) solution of the independent target task. Our results emphasize the ability of transfer learning to extend the solution space of the target task and thereby attain an improved MMSE solution. We formulate the linear MMSE solution to our transfer learning setting and point out its key differences from the common design philosophy of transfer learning.
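The regularized formulation described in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the distance is taken to be the squared Euclidean distance, and the source and target parameters are compared directly, with no task-relation operator between them. The names `transfer_ridge`, `theta_src`, and `alpha` are hypothetical, and the synthetic data are for illustration only. Under these assumptions, the objective ||y - X b||^2 + alpha * ||b - theta_src||^2 has a closed-form minimizer given by the normal equations.

```python
import numpy as np


def transfer_ridge(X, y, theta_src, alpha):
    """Minimize ||y - X b||^2 + alpha * ||b - theta_src||^2 over b.

    Assumed form of the regularizer (squared Euclidean distance to the
    source parameters); not necessarily the exact formulation analyzed.
    """
    d = X.shape[1]
    # Normal equations: (X^T X + alpha I) b = X^T y + alpha * theta_src.
    # For alpha > 0 the matrix is positive definite, even when d > n.
    A = X.T @ X + alpha * np.eye(d)
    rhs = X.T @ y + alpha * theta_src
    return np.linalg.solve(A, rhs)


# Synthetic overparameterized example: more parameters (d) than samples (n).
rng = np.random.default_rng(0)
n, d = 20, 50
alpha = 1.0
beta_true = rng.normal(size=d) / np.sqrt(d)  # isotropic Gaussian parameters
# Hypothetical source estimate: the true target parameters plus a perturbation.
theta_src = beta_true + 0.1 * rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ beta_true + 0.1 * rng.normal(size=n)

beta_tl = transfer_ridge(X, y, theta_src, alpha)       # regularize toward source
beta_ridge = transfer_ridge(X, y, np.zeros(d), alpha)  # standard ridge regression
```

Setting `theta_src` to the zero vector recovers standard ridge regression, which is one way to read the claim that transfer learning extends the solution space of the target task: ridge regression is the special case in which the regularization center is fixed at the origin.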