Transfer learning has become an essential technique for exploiting information from source domains to improve performance on a target task. Although heterogeneity and heavy tails are prevalent in high-dimensional data, current transfer learning approaches do not adequately account for them, which may undermine the resulting performance. We propose a transfer learning procedure within the framework of high-dimensional quantile regression models to accommodate heterogeneity and heavy tails in both the source and target domains. We establish error bounds for the transfer learning estimator based on carefully selected transferable source domains, showing that lower error bounds can be achieved under a critical selection criterion and with larger source sample sizes. We further propose valid confidence intervals and hypothesis testing procedures for individual components of the high-dimensional quantile regression coefficients by introducing a double transfer learning estimator, that is, a one-step debiased version of the transfer learning estimator in which the transfer learning technique is applied a second time. Using a data-splitting technique, we propose a transferability detection approach that, with high probability, avoids negative transfer and identifies the transferable sources. Simulation results show that the proposed method performs favorably, and its practical utility is further illustrated through the analysis of a real data example.
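To make the role of quantile regression under heavy tails concrete, the following minimal sketch illustrates the check (pinball) loss on which quantile regression is built. It is only an illustration under stated assumptions, not the estimator proposed here: the function names (`check_loss`, `empirical_risk`, `best_constant`) and the toy sample are hypothetical, and the model is reduced to a single constant with no covariates, penalty, or source data.

```python
def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def empirical_risk(q, ys, tau):
    """Average check loss of the constant predictor q on the sample ys."""
    return sum(check_loss(y - q, tau) for y in ys) / len(ys)


def best_constant(ys, tau):
    """Minimize the empirical check loss over the observed values.

    The risk is piecewise linear in q, so its minimum is attained at a
    data point; searching the sample is enough for this toy case.
    """
    return min(sorted(ys), key=lambda q: empirical_risk(q, ys, tau))


# Hypothetical toy sample; the extreme value stands in for a heavy tail.
sample = [1.0, 2.0, 3.0, 4.0, 1000.0]

median_fit = best_constant(sample, tau=0.5)  # 3.0, unaffected by the outlier
mean_fit = sum(sample) / len(sample)         # 202.0, dominated by the outlier
```

The check-loss minimizer at `tau=0.5` stays at the sample median, while the least-squares fit (the mean) is pulled far toward the extreme observation. This robustness is a general property of quantile loss and is one reason it is a natural choice when heavy tails are a concern.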