Transfer learning has become an essential technique for using information from source datasets to improve performance on a target task. In high-dimensional data, however, heterogeneity often arises from heteroscedastic variance or inhomogeneous covariate effects. To address this problem, this paper proposes a robust transfer learning method based on Huber regression, designed for scenarios where the transferable source datasets are known. The method mitigates the impact of data heteroscedasticity, improving both estimation and prediction accuracy. When the transferable source datasets are unknown, the paper introduces an efficient detection algorithm to identify informative sources. The effectiveness of the proposed method is demonstrated through numerical simulations and an empirical analysis of superconductor data.
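As a rough illustration of why the Huber loss is a natural building block for robustness, the sketch below fits a one-covariate regression without an intercept by iteratively reweighted least squares and compares it with ordinary least squares on data containing one gross outlier. It is not the paper's transfer learning procedure: the function names (`huber_loss`, `huber_weight`, `huber_slope`), the toy data, and the choice of `delta = 1.345` (a conventional tuning constant) are all assumptions made for illustration.

```python
def huber_loss(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def huber_weight(r, delta=1.345):
    """IRLS weight psi(r)/r: 1 in the quadratic zone, delta/|r| outside it."""
    a = abs(r)
    return 1.0 if a <= delta else delta / a


def ols_slope(x, y):
    """Least-squares slope for y ~ beta * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def huber_slope(x, y, delta=1.345, n_iter=50):
    """Huber slope for y ~ beta * x via iteratively reweighted least squares."""
    beta = ols_slope(x, y)  # start from the non-robust fit
    for _ in range(n_iter):
        # Large residuals receive weights below 1, so they pull less on beta.
        w = [huber_weight(yi - beta * xi, delta) for xi, yi in zip(x, y)]
        num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
        den = sum(wi * xi * xi for wi, xi in zip(w, x))
        beta = num / den
    return beta


# Hypothetical toy data: the first four points lie on y = 2x, the last is an outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 40.0]

beta_ols = ols_slope(x, y)
beta_huber = huber_slope(x, y)
print(f"OLS slope:   {beta_ols:.3f}")
print(f"Huber slope: {beta_huber:.3f}")
```

Because the Huber loss grows only linearly for large residuals, the outlier's influence on the fit is bounded, and the Huber slope stays much closer to the slope of the clean points than the least-squares slope does. The same bounded-influence property is what makes the loss attractive when noise variance differs across observations or sources.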