Transfer learning is a critical part of real-world machine learning deployments and has been studied extensively in experimental work on overparameterized neural networks. However, even in the simplest setting of linear regression, a notable gap remains in the theoretical understanding of transfer learning. In-distribution research on high-dimensional linear regression has identified a phenomenon known as \textit{benign overfitting}, in which linear interpolators overfit to noisy training labels and yet still generalize well. This behavior occurs under specific conditions on the source covariance matrix and input data dimension. It is therefore natural to ask how such high-dimensional linear models behave under transfer learning. We prove the first non-asymptotic excess risk bounds for benignly overfit linear interpolators in the transfer learning setting. From our analysis, we propose a taxonomy of \textit{beneficial} and \textit{malignant} covariate shifts based on the degree of overparameterization. We follow our analysis with empirical studies that exhibit these beneficial and malignant covariate shifts for linear interpolators on real image data, and for fully connected neural networks in settings where the input data dimension is larger than the training sample size.
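The setting described in the abstract can be illustrated with a minimal numerical sketch. It assumes a Gaussian design with a diagonal source covariance, takes the linear interpolator to be the minimum-norm least-squares solution, and measures excess risk as the quadratic form of the estimation error under a chosen covariance. The dimensions, the noise level, the spectrum, and the two target covariances (which only rescale the tail eigenvalues) are hypothetical choices made for illustration and are not taken from the paper's analysis or experiments.

```python
import numpy as np


def min_norm_interpolator(X, y):
    """Minimum-l2-norm solution of X @ theta = y (assumes n < d and full row rank)."""
    return np.linalg.pinv(X) @ y


def excess_risk(theta_hat, theta_star, cov):
    """Quadratic form (theta_hat - theta_star)^T cov (theta_hat - theta_star)."""
    diff = theta_hat - theta_star
    return float(diff @ cov @ diff)


rng = np.random.default_rng(0)
n, d, k = 50, 500, 5  # overparameterized: d > n (hypothetical sizes)

# Hypothetical source spectrum: k large eigenvalues followed by a flat, small tail.
source_eigs = np.concatenate([np.ones(k), np.full(d - k, 1e-3)])
theta_star = np.zeros(d)
theta_star[:k] = 1.0  # signal lives in the top-k directions (assumption)

# Draw source data with covariance diag(source_eigs) and add label noise.
X = rng.standard_normal((n, d)) * np.sqrt(source_eigs)
y = X @ theta_star + 0.5 * rng.standard_normal(n)

theta_hat = min_norm_interpolator(X, y)
train_residual = float(np.max(np.abs(X @ theta_hat - y)))  # near zero: noisy labels are interpolated

# Two hypothetical covariate shifts: shrink or inflate the tail eigenvalues only.
small_tail = np.concatenate([np.ones(k), np.full(d - k, 1e-4)])
large_tail = np.concatenate([np.ones(k), np.full(d - k, 1e-2)])

risk_source = excess_risk(theta_hat, theta_star, np.diag(source_eigs))
risk_small_tail = excess_risk(theta_hat, theta_star, np.diag(small_tail))
risk_large_tail = excess_risk(theta_hat, theta_star, np.diag(large_tail))

print(f"max train residual:     {train_residual:.2e}")
print(f"excess risk, source:    {risk_source:.4f}")
print(f"excess risk, small tail:{risk_small_tail:.4f}")
print(f"excess risk, large tail:{risk_large_tail:.4f}")
```

Because the two target covariances here differ from the source only by scaling the tail eigenvalues, the ordering of the three risks follows directly from the quadratic form. The sketch shows only how the same interpolator can be scored under different target covariances. It does not reproduce the taxonomy of beneficial and malignant shifts.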