Transfer learning, which uses knowledge from a source domain to improve model performance in a target domain, has grown rapidly in recent years and now underpins many real-world applications. Its success depends on knowledge shared between the two domains, a prerequisite of most transfer learning methods. These methods typically assume that the source and target domains have identical feature spaces and label spaces, a setting known as homogeneous transfer learning. This assumption is not always practical. The source and target domains often differ in feature spaces, data distributions, and label spaces, so obtaining source domain data with the same feature and label spaces as the target domain can be difficult or costly. Arbitrarily eliminating these differences is not always feasible or optimal. Heterogeneous transfer learning, which acknowledges and handles such disparities, has therefore emerged as a promising approach for a variety of tasks. Although a survey on this topic appeared in 2017, the rapid advances since then call for an updated, in-depth review. We therefore present a comprehensive survey of recent developments in heterogeneous transfer learning methods, offering a systematic guide for future research. Our paper reviews methodologies for diverse learning scenarios, discusses the limitations of current studies, and covers application contexts including Natural Language Processing, Computer Vision, Multimodality, and Biomedicine, with the aim of fostering deeper understanding and spurring future research.