We study transfer learning for estimating piecewise-constant signals when source data, which may be relevant but disparate, are available in addition to the target data. We first investigate transfer learning estimators that employ $\ell_1$- and $\ell_0$-penalties, respectively, in the unisource setting, and then generalise these estimators to accommodate multisource data. To further reduce estimation error, especially when some sources differ significantly from the target, we introduce an informative source selection algorithm. We then examine these estimators with multisource selection and establish their minimax optimality under specific regularity conditions. Notably, unlike the prevalent narrative in the transfer learning literature that performance is enhanced through large source sample sizes, our approaches leverage higher observation frequencies and accommodate diverse frequencies across multiple sources. Our theoretical findings are empirically validated through extensive numerical experiments; the code is available online at https://github.com/chrisfanwang/transferlearning