In medical image analysis, fast, efficient, and accurate segmentation is essential for automated diagnosis and treatment. Although recent advances in deep learning have significantly improved segmentation accuracy, current models often struggle with adaptability and generalization, particularly when processing multi-modal medical imaging data. These limitations stem from the substantial variation across imaging modalities and the inherent complexity of medical data. To address these challenges, we propose the Strategy-driven Interactive Segmentation Model (SISeg), built on SAM2, which improves segmentation performance across diverse medical imaging modalities by integrating a selection engine. To mitigate memory bottlenecks and optimize prompt-frame selection during inference on 2D image sequences, we develop an automated system, the Adaptive Frame Selection Engine (AFSE). AFSE dynamically selects optimal prompt frames without requiring extensive prior medical knowledge, and its interactive feedback mechanism improves the interpretability of the model's inference process. Extensive experiments on 10 datasets covering 7 representative medical imaging modalities demonstrate the SISeg model's robust adaptability and generalization in multi-modal tasks. The project page and code will be available at: [URL].