Revisiting Fine-Tuning Strategies for Self-supervised Medical Imaging Analysis

From arXiv. Extended Abstract presented at the Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States; 18 pages.

Despite the rapid progress in self-supervised learning (SSL), end-to-end fine-tuning remains the dominant fine-tuning strategy for medical imaging analysis. However, it is unclear whether this approach makes the best use of pre-trained knowledge, especially given that different categories of SSL capture different types of features. In this paper, we present the first comprehensive study of effective fine-tuning strategies for self-supervised learning in medical imaging. After developing strong contrastive and restorative SSL baselines that outperform SOTA methods across four diverse downstream tasks, we conduct an extensive fine-tuning analysis across multiple pre-training and fine-tuning datasets, as well as various fine-tuning dataset sizes. Contrary to the conventional wisdom of fine-tuning only the last few layers of a pre-trained network, we show that fine-tuning intermediate layers is more effective: fine-tuning the second quarter (25-50%) of the network is optimal for contrastive SSL, whereas fine-tuning the third quarter (50-75%) is optimal for restorative SSL. Compared to the de facto standard of end-to-end fine-tuning, our best fine-tuning strategy, which fine-tunes a shallower network consisting of the first three quarters (0-75%) of the pre-trained network, yields improvements of as much as 5.48%. Additionally, using these insights, we propose a simple yet effective method to leverage the complementary strengths of multiple SSL models, resulting in improvements of up to 3.57% over using the best model alone. Hence, our fine-tuning strategies not only enhance the performance of individual SSL models, but also enable effective use of the complementary strengths of multiple SSL models, leading to significant improvements in self-supervised medical imaging analysis.
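The depth ranges named in the abstract (second quarter, third quarter, first three quarters) can be made concrete with a small sketch. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the network is divided into quarters, so the code below assumes an ordered list of equally weighted blocks, and the names depth_range, trainable_mask, and truncate are hypothetical helpers introduced only for this example.

```python
from typing import List, Sequence, Tuple


def depth_range(num_blocks: int, start_frac: float, end_frac: float) -> Tuple[int, int]:
    """Map a depth range such as (0.25, 0.50) to [start, end) block indices.

    Assumption: every block counts equally toward network depth.
    """
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    if not 0.0 <= start_frac < end_frac <= 1.0:
        raise ValueError("expected 0 <= start_frac < end_frac <= 1")
    return int(num_blocks * start_frac), int(num_blocks * end_frac)


def trainable_mask(blocks: Sequence[str], start_frac: float, end_frac: float) -> List[bool]:
    """Mark which blocks would be updated; all other blocks stay frozen."""
    start, end = depth_range(len(blocks), start_frac, end_frac)
    return [start <= i < end for i in range(len(blocks))]


def truncate(blocks: Sequence[str], keep_frac: float) -> List[str]:
    """Keep only the first keep_frac of the blocks (a shallower network)."""
    _, end = depth_range(len(blocks), 0.0, keep_frac)
    return list(blocks[:end])


# Hypothetical 8-block encoder, ordered from input to output.
blocks = [f"block{i}" for i in range(8)]

# Second quarter (25-50%): reported as optimal for contrastive SSL.
contrastive_mask = trainable_mask(blocks, 0.25, 0.50)  # True for block2, block3

# Third quarter (50-75%): reported as optimal for restorative SSL.
restorative_mask = trainable_mask(blocks, 0.50, 0.75)  # True for block4, block5

# First three quarters (0-75%): the shallower network that is fine-tuned.
shallow_blocks = truncate(blocks, 0.75)  # block0 .. block5
```

In a deep learning framework, the mask would typically decide which parameters receive gradient updates, and the truncated block list would be followed by a task-specific head; both of those steps are assumptions here, since the abstract does not describe them.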
