This work introduces ProtoSAM, a new framework for one-shot medical image segmentation. It combines prototypical networks, known for few-shot segmentation, with SAM, a foundation model trained on natural images. The proposed method first creates a coarse segmentation mask using the ALPnet prototypical network augmented with a DINOv2 encoder. Prompts such as points and bounding boxes are then extracted from this initial mask and fed into the Segment Anything Model (SAM) to produce the final segmentation. The framework achieves state-of-the-art results on several medical image datasets, demonstrating automated segmentation from a single labeled example (one shot) without fine-tuning the foundation model.
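The prompt-extraction step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the coarse ALPnet output has been thresholded to a binary mask, and derives a bounding box plus a single foreground point (here the mask centroid) in the XYXY / (x, y) conventions that SAM's `SamPredictor.predict` expects.

```python
import numpy as np


def mask_to_prompts(mask: np.ndarray):
    """Derive SAM-style prompts from a coarse binary mask.

    Returns (point_coords, point_labels, box), or None if the mask is empty.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    # Bounding box of the foreground region, XYXY order as SAM expects.
    box = np.array([xs.min(), ys.min(), xs.max(), ys.max()])
    # One foreground point: the mask centroid (assumed to fall inside the
    # region; real implementations may need a point guaranteed to be inside).
    point_coords = np.array([[xs.mean(), ys.mean()]])
    point_labels = np.array([1])  # 1 marks a foreground point
    return point_coords, point_labels, box
```

These arrays could then be passed to a loaded `SamPredictor` via `predictor.predict(point_coords=..., point_labels=..., box=...)` to obtain the refined mask; how the paper selects and weights its prompts may differ from this sketch.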