The universal model is emerging as a promising trend for medical image segmentation, paving the way toward building a medical imaging large model (MILM). One popular strategy for building universal models is to encode each task as a one-hot vector and generate dynamic convolutional layers at the end of the decoder to extract the target of interest. Although successful, this strategy ignores the correlations among tasks and makes the model 'aware' of the ongoing task too late. To address both issues, we propose a prompt-driven Universal Segmentation model (UniSeg) for multi-task medical image segmentation across diverse modalities and domains. We first devise a learnable universal prompt to describe the correlations among all tasks, and then convert this prompt and the image features into a task-specific prompt, which is fed to the decoder as part of its input. Thus, we make the model 'aware' of the ongoing task early and boost the task-specific training of the whole decoder. Our results indicate that UniSeg outperforms other universal models and single-task models on 11 upstream tasks. Moreover, UniSeg also beats other pre-trained models on two downstream datasets, providing the community with a high-quality pre-trained model for 3D medical image segmentation. Code and model are available at https://github.com/yeerwen/UniSeg.
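The prompt mechanism described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the UniSeg implementation: the class `PromptFusion` and the parameters `prompt_dim` and `feature_dim` are hypothetical names, the image features are reduced to a flat vector instead of a 3D feature map, and the fusion step is a single random linear map standing in for whatever learned layers the model actually uses. Only the overall flow (a shared universal prompt, fusion with image features, selection of a task-specific prompt, and concatenation into the decoder input) is taken from the text; the released code in the repository is the authoritative reference.

```python
import random


def make_matrix(rows, cols, rng):
    """Small random matrix standing in for learned parameters (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class PromptFusion:
    """Hypothetical sketch: universal prompt -> task-specific prompt."""

    def __init__(self, num_tasks, prompt_dim, feature_dim, seed=0):
        rng = random.Random(seed)
        self.num_tasks = num_tasks
        self.prompt_dim = prompt_dim
        # Universal prompt: one slot per task, shared by all tasks so that
        # correlations among tasks can be captured in one place.
        self.universal_prompt = make_matrix(num_tasks, prompt_dim, rng)
        # Assumed fusion layer: a single linear map over [prompt, features].
        in_dim = num_tasks * prompt_dim + feature_dim
        self.weights = make_matrix(num_tasks * prompt_dim, in_dim, rng)

    def task_prompt(self, features, task_id):
        """Fuse the universal prompt with image features, then pick one task."""
        flat_prompt = [v for slot in self.universal_prompt for v in slot]
        fused = matvec(self.weights, flat_prompt + features)
        start = task_id * self.prompt_dim
        return fused[start:start + self.prompt_dim]

    def decoder_input(self, features, task_id):
        """The task-specific prompt becomes part of the decoder input."""
        return features + self.task_prompt(features, task_id)


# 11 tasks as in the abstract; the dimensions are arbitrary toy values.
fusion = PromptFusion(num_tasks=11, prompt_dim=4, feature_dim=8)
features = [0.5] * 8
input_task0 = fusion.decoder_input(features, task_id=0)
input_task3 = fusion.decoder_input(features, task_id=3)
print(len(input_task0))  # 12 = 8 feature values + 4 prompt values
```

In this sketch the decoder sees the task identity from its very first layer, because the task-specific prompt is part of its input. That contrasts with the one-hot strategy, where the task code only influences the dynamic convolutional layers at the end of the decoder.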