Transfer learning has become a powerful tool for initializing deep learning models, yielding faster convergence and higher performance. It is especially useful in medical imaging analysis, where data scarcity limits the performance gains that deep learning models can achieve. Prior work has boosted the gains from transfer learning by merging models that start from the same initialization. In medical imaging analysis, however, there is an opportunity to merge models that start from different initializations, thereby combining the features learned from different tasks. In this work, we propose MedMerge, a method for merging the weights of different models so that their features can be used effectively to boost performance on a new task. With MedMerge, we learn kernel-level weights that can later be used to merge the models into a single model, even when they start from different initializations. Testing on various medical imaging analysis tasks, we show that our merged model achieves significant performance gains, with up to a 3% improvement in F1 score. The code implementation of this work will be available at www.github.com/BioMedIA-MBZUAI/MedMerge.
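To make the idea of kernel-level merging concrete, the following is a minimal sketch and not the MedMerge implementation. It assumes that two convolutional layers of identical shape are combined with one scalar per 2D kernel, using a convex combination; this merging rule, the function name `merge_kernels`, and the variable `alpha` are all hypothetical and chosen for illustration. In the method described above the kernel-level weights are learned, whereas here they are fixed by hand.

```python
# Hypothetical sketch of kernel-level merging for one convolutional layer.
# Assumption: a convex combination with one scalar per 2D kernel. The
# abstract does not specify the merging rule, so this is illustrative only.

def merge_kernels(weights_a, weights_b, alpha):
    """Merge two conv weight tensors stored as nested lists [out][in][kh][kw].

    alpha[o][i] is one scalar in [0, 1] per 2D kernel:
    1.0 keeps the kernel from model A, 0.0 keeps the kernel from model B.
    """
    merged = []
    for o, (filters_a, filters_b) in enumerate(zip(weights_a, weights_b)):
        merged_filter = []
        for i, (kernel_a, kernel_b) in enumerate(zip(filters_a, filters_b)):
            a = alpha[o][i]
            # Blend every element of this kernel with the same scalar.
            merged_filter.append([
                [a * wa + (1.0 - a) * wb for wa, wb in zip(row_a, row_b)]
                for row_a, row_b in zip(kernel_a, kernel_b)
            ])
        merged.append(merged_filter)
    return merged


# Toy layer: 1 output channel, 2 input channels, 2x2 kernels.
weights_a = [[[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]]
weights_b = [[[[3.0, 3.0], [3.0, 3.0]], [[4.0, 4.0], [4.0, 4.0]]]]

# Hand-picked values; in a learned setting these would be trained parameters.
alpha = [[1.0, 0.5]]

merged = merge_kernels(weights_a, weights_b, alpha)
```

With these toy values, the first kernel is taken entirely from model A and the second is an equal blend of both models. The result is a single weight tensor of the original shape, which is what allows the merged model to be used as one network.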