Despite recent advances in transfer learning with multiple source data sets, developments are still lacking for mixture target populations that, in practice, can often be approximated as a composite of the sources due to key factors such as ethnicity. To address this open problem under distributional shifts in both covariates and outcome models, together with the absence of accurate labels on the target, we propose a novel approach for distributionally robust transfer learning targeting a mixture population. Relying on a joint source mixture assumption for the target population, our method learns a set of covariate-specific weights to infer the target outcome model from the multiple sources. It then incorporates a group adversarial learning step to enhance robustness against moderate violations of the joint mixture assumption. In addition, our framework allows the use of side information, such as a small labeled sample, as guidance to avoid over-conservative results. The statistical convergence and predictive accuracy of our method are quantified through asymptotic studies. Simulation and real-world studies demonstrate that our method outperforms existing multi-source and transfer learning approaches.
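The two steps described above (a simplex-weighted mixture of source outcome models, followed by a group-adversarial refinement) can be illustrated with a minimal toy sketch. This is a hypothetical simplification, not the paper's estimator: the sources are pre-fitted linear models, the weights are scalar rather than covariate-specific, and the adversarial groups are an arbitrary split of the target sample.

```python
# Hedged toy sketch: (1) fit simplex mixture weights over K source models by
# projected gradient; (2) a group-adversarial (minimax) refinement that
# minimizes the worst subgroup loss, guarding against moderate violation of
# the joint mixture assumption. All names and data here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, p, K = 500, 5, 3

# Toy "pre-fitted" source outcome models (coefficient vectors).
betas = [rng.normal(size=p) for _ in range(K)]

# Target covariates; the true target model is a mixture of the sources.
X = rng.normal(size=(n, p))
true_w = np.array([0.5, 0.3, 0.2])
y = X @ sum(w * b for w, b in zip(true_w, betas)) + 0.1 * rng.normal(size=n)

preds = np.column_stack([X @ b for b in betas])  # (n, K) source predictions

def simplex_proj(v):
    """Euclidean projection onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1
    rho = np.nonzero(u > css / (np.arange(len(v)) + 1))[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0)

# Step 1: least-squares mixture weights on the simplex (projected gradient).
w = np.full(K, 1 / K)
for _ in range(2000):
    grad = 2 * preds.T @ (preds @ w - y) / n
    w = simplex_proj(w - 0.05 * grad)

# Step 2: group-adversarial refinement. Split the target sample into
# hypothetical subgroups and minimize the worst subgroup loss: gradient
# descent on w, exponentiated-gradient ascent on the group weights q.
groups = np.array_split(np.arange(n), 4)
q = np.full(len(groups), 1 / len(groups))
for _ in range(2000):
    losses = np.array([np.mean((preds[g] @ w - y[g]) ** 2) for g in groups])
    q = q * np.exp(0.5 * losses); q /= q.sum()            # adversary ascends
    grad = sum(qi * 2 * preds[g].T @ (preds[g] @ w - y[g]) / len(g)
               for qi, g in zip(q, groups))
    w = simplex_proj(w - 0.05 * grad)                     # learner descends

print(np.round(w, 3))  # learned mixture weights on the simplex
```

With covariate-specific weights, `w` would instead be the output of a function of `X` (e.g., a softmax over a low-dimensional parametrization), but the simplex constraint and the minimax structure of the adversarial step carry over unchanged.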