Growing concerns over data privacy, along with other difficulties in retrieving source data for model training, have created the need for source-free transfer learning, in which one has access only to pre-trained models rather than to data from the original source domains. This setting is challenging because many existing transfer learning methods rely on access to source data and therefore do not apply directly when that data is unavailable. Practical concerns add further difficulty, for instance selecting models for transfer efficiently without information about the source data, and transferring without full access to the source models. So motivated, we propose a model recycling framework for parameter-efficient training that identifies subsets of related source models to reuse in both white-box and black-box settings. Our framework thus makes it possible for Model as a Service (MaaS) providers to build libraries of efficient pre-trained models, creating an opportunity for multi-source data-free supervised transfer learning.