Foundation models pre-trained on large-scale data have achieved widespread success on a variety of natural-image downstream tasks. Parameter-efficient fine-tuning (PEFT) methods aim to adapt foundation models to new domains by updating only a small fraction of their parameters, thereby reducing computational overhead. However, the effectiveness of these PEFT methods in cross-domain few-shot scenarios, e.g., medical image analysis, has not been fully explored. In this work, we study the performance of PEFT when adapting foundation models to medical image classification tasks. Furthermore, to overcome the limitations of mainstream prompt tuning methods in how prompts are introduced and in their approximation capability on Transformer architectures, we propose the Embedded Prompt Tuning (EPT) method, which embeds prompt tokens into expanded channels. We also find that the feature-space distributions of foundation models exhibit anomalies arising from the pre-training process, and that prompt tuning can help mitigate this negative impact. To explain this phenomenon, we introduce a novel perspective on prompt tuning: prompt tuning acts as a distribution calibrator. We support this view by analyzing the patch-wise scaling and feature-separation operations contained in EPT. Our experiments show that EPT outperforms several state-of-the-art fine-tuning methods by a significant margin on few-shot medical image classification tasks and completes fine-tuning in highly competitive time, indicating that EPT is an effective PEFT method. The source code is available at github.com/zuwenqiang/EPT.
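To make the contrast with mainstream prompt tuning concrete, the following is a minimal NumPy sketch of the channel-embedding idea named in the abstract. It is an illustrative assumption, not the authors' exact design: the shapes, the per-token tiling of the prompt, and the projection back to the backbone width (`proj`) are all hypothetical choices made here for demonstration.

```python
import numpy as np

def prepend_prompt_tokens(tokens, prompts):
    """Mainstream prompt tuning (e.g., VPT-style): add prompts as extra
    tokens along the sequence dimension, leaving the channel width fixed.

    tokens: (num_tokens, dim); prompts: (num_prompts, dim)
    returns: (num_tokens + num_prompts, dim)
    """
    return np.concatenate([prompts, tokens], axis=0)

def embed_prompt_channels(tokens, prompt, proj):
    """Hypothetical sketch of EPT's channel embedding: append learnable
    prompt values to every token's feature vector (expanding the channel
    dimension), then project back to the original width so the frozen
    backbone can consume the result.

    tokens: (num_tokens, dim); prompt: (prompt_dim,);
    proj: (dim + prompt_dim, dim) -- an assumed learnable projection.
    returns: (num_tokens, dim)
    """
    n, _ = tokens.shape
    expanded = np.concatenate([tokens, np.tile(prompt, (n, 1))], axis=1)
    return expanded @ proj

rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))

seq_out = prepend_prompt_tokens(tokens, rng.standard_normal((2, 8)))
chan_out = embed_prompt_channels(tokens, rng.standard_normal(2),
                                 rng.standard_normal((10, 8)))
print(seq_out.shape)   # (6, 8): sequence grows, width unchanged
print(chan_out.shape)  # (4, 8): token count unchanged
```

The point of the contrast is structural: sequence-level prompting changes how many tokens attention mixes, while channel-level embedding modifies each token's own representation before it re-enters the frozen backbone.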