Transfer learning techniques aim to leverage information from multiple related datasets to improve prediction quality on a target dataset. Such methods have been adopted in the context of high-dimensional sparse regression, where several Lasso-based algorithms have been developed; Trans-Lasso and Pretraining Lasso are two representative examples. These algorithms require the statistician to select hyperparameters that control the extent and type of information transferred from the related datasets. However, selection strategies for these hyperparameters, as well as the impact of these choices on the algorithm's performance, have been largely unexplored. To address this, we conduct a thorough and precise study of the algorithm in a high-dimensional setting via an asymptotic analysis using the replica method. Our approach reveals a surprisingly simple behavior of the algorithm: ignoring one of the two types of information transferred to the fine-tuning stage has little effect on generalization performance, implying that the effort required for hyperparameter selection can be significantly reduced. Our theoretical findings are also empirically supported by real-world applications on the IMDb dataset.
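As a concrete illustration of the two-stage pipeline described above, the following is a minimal sketch of a pretraining-then-fine-tuning Lasso workflow. It is not the exact Pretraining Lasso estimator; the penalty levels, the synthetic data, and all variable names are illustrative assumptions. It shows the two types of transferred information mentioned in the abstract: (a) the pretrained coefficient values, used as an offset on the target data, and (b) the pretrained support, used to restrict the features in the fine-tuning stage.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p = 50
beta = np.zeros(p)
beta[:5] = 1.0  # sparse ground-truth signal (illustrative)

# Related (source) dataset and a smaller, slightly shifted target dataset
X_src = rng.normal(size=(200, p))
y_src = X_src @ beta + 0.1 * rng.normal(size=200)
X_tgt = rng.normal(size=(40, p))
y_tgt = X_tgt @ (beta + 0.1) + 0.1 * rng.normal(size=40)

# Stage 1: pretrain a Lasso on the source data
pre = Lasso(alpha=0.05).fit(X_src, y_src)
beta_pre = pre.coef_
support = beta_pre != 0  # information (b): the estimated support

# Stage 2: fine-tune on the target, transferring both types of information.
# Information (a): the pretrained coefficients enter as an offset, so the
# Lasso only has to fit the residual y - X @ beta_pre.
# Information (b): features are restricted to the pretrained support.
resid = y_tgt - X_tgt @ beta_pre
fine = Lasso(alpha=0.05).fit(X_tgt[:, support], resid)

beta_hat = beta_pre.copy()
beta_hat[support] += fine.coef_  # final target estimate
```

Dropping either transferred quantity (setting the offset to zero, or keeping all features instead of the support) gives the "ignore one type of information" variants whose generalization behavior the analysis characterizes.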