Transferability estimation identifies the best pre-trained models for downstream tasks without incurring the high computational cost of full fine-tuning. This capability eases deployment and advances the pre-training and fine-tuning paradigm. However, existing methods often struggle to accurately assess the transferability of emerging pre-trained models with diverse architectures, training strategies, and task alignments. In this work, we propose Implicit Transferability Modeling (ITM), a novel framework that implicitly models each model's intrinsic transferability, coupled with a Divide-and-Conquer Variational Approximation (DVA) strategy that efficiently approximates the evolution of the embedding space. This design enables generalization across a broader range of models and downstream tasks. Extensive experiments on a comprehensive benchmark spanning diverse training regimes and a wide variety of model types demonstrate that ITM consistently outperforms existing methods in stability, effectiveness, and efficiency.
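To make the problem setting concrete, the sketch below shows a generic transferability-estimation loop: each candidate model is scored from its frozen features on the target data, and candidates are ranked without any fine-tuning. This is only a minimal illustration using a simple nearest-class-centroid proxy; it is not the ITM/DVA method described above, and the model names and synthetic features are hypothetical.

```python
import numpy as np

def transferability_score(features: np.ndarray, labels: np.ndarray) -> float:
    """Score a candidate by how well its frozen features separate the
    downstream classes, via a nearest-class-centroid proxy.
    (Generic illustration only; not the paper's ITM/DVA method.)"""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Assign each sample to its nearest class centroid; the resulting
    # accuracy serves as a cheap stand-in for post-fine-tuning performance.
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    preds = classes[np.argmin(dists, axis=1)]
    return float((preds == labels).mean())

# Hypothetical candidate pool: frozen features from two pre-trained models
# on the same downstream samples (synthetic data for demonstration).
rng = np.random.default_rng(0)
labels = rng.integers(0, 5, size=200)
candidates = {
    "model_a": rng.normal(size=(200, 64)) + labels[:, None] * 0.5,  # more class-separable
    "model_b": rng.normal(size=(200, 64)),                          # less class-separable
}
ranking = sorted(candidates,
                 key=lambda m: transferability_score(candidates[m], labels),
                 reverse=True)
print(ranking)  # expected: ['model_a', 'model_b']
```

A good estimator is judged by how well such a ranking correlates with the ground-truth ranking obtained by actually fine-tuning every candidate, which is the expensive step these methods avoid.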