MeLo: Low-rank Adaptation is Better than Fine-tuning for Medical Image Diagnosis

Computer-aided diagnosis (CAD) models based on transformer architectures are commonly developed by fine-tuning ImageNet pre-trained weights. However, with recent advances in large-scale pre-training and the application of scaling laws, Vision Transformers (ViT) have become much larger and less accessible to medical imaging communities. Additionally, in real-world scenarios, deploying multiple CAD models can be troublesome because of limited storage space and time-consuming model switching. To address these challenges, we propose a new method, MeLo (Medical image Low-rank adaptation), which enables the development of a single CAD model for multiple clinical tasks in a lightweight manner. It adopts low-rank adaptation instead of resource-demanding fine-tuning. By freezing the weights of ViT models and adding only small low-rank plug-ins, we achieve competitive results on various diagnosis tasks across different imaging modalities with only a few trainable parameters. Specifically, our proposed method achieves performance comparable to fully fine-tuned ViT models on four distinct medical imaging datasets while training only about 0.17% of the parameters. Moreover, MeLo adds only about 0.5 MB of storage space and allows for extremely fast model switching in deployment and inference. Our source code and pre-trained weights are available on our website (https://absterzhu.github.io/melo.github.io/).
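The low-rank plug-in idea described above can be illustrated with a minimal sketch in plain Python. It is a generic low-rank adapter, not the authors' released code: the class `LowRankAdapter`, the helpers `adapted_forward` and `lora_param_count`, and the task names are all hypothetical. The sizing at the end assumes a ViT-Base-sized backbone (hidden size 768, 12 layers, roughly 86 million parameters) with rank-4 adapters on two projections per layer; the abstract does not state these settings, so they are assumptions of this sketch only.

```python
# Illustrative sketch only: a generic low-rank adapter, NOT the authors' code.
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


class LowRankAdapter:
    """Trainable pair (A, B); the product B @ A is added to a frozen weight."""

    def __init__(self, d_out, d_in, rank, seed=0):
        rng = random.Random(seed)
        # A starts small and random, B starts at zero, so the adapter changes
        # nothing until it is trained (a common low-rank-adaptation init).
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]


def adapted_forward(W, adapter, x):
    """Compute (W + B A) x without modifying the frozen weight W."""
    base = matmul(W, x)
    if adapter is None:
        return base
    return add(base, matmul(adapter.B, matmul(adapter.A, x)))


def lora_param_count(d_in, d_out, rank):
    """Parameters in one adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


# One frozen weight shared by every task; only the small adapters differ.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x = [[1.0], [2.0], [3.0]]  # column vector
adapters = {  # hypothetical task names
    "task_a": LowRankAdapter(3, 3, rank=2, seed=1),
    "task_b": LowRankAdapter(3, 3, rank=2, seed=2),
}
# Switching tasks is a dictionary lookup; the backbone weight is never reloaded.
y = adapted_forward(W, adapters["task_a"], x)

# Sizing under the ASSUMED configuration described in the lead-in.
hidden, layers, adapted_per_layer, rank = 768, 12, 2, 4
backbone_params = 86_000_000  # approximate ViT-Base size (assumption)
total = layers * adapted_per_layer * lora_param_count(hidden, hidden, rank)
fraction = total / backbone_params
size_mib = total * 4 / (1024 * 1024)  # 4 bytes per 32-bit float
print(total, f"{fraction:.2%}", f"{size_mib:.2f} MiB")
```

Under these assumed settings the adapters hold 147,456 parameters, about 0.17% of the backbone and roughly 0.56 MiB in 32-bit floats, which is the same order as the figures quoted in the abstract. Because every task shares the frozen weight, switching tasks only swaps the small (A, B) pairs.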

