Traditionally, approaches to improving segmentation performance have favored adding ever more complex modules. This is ill-suited to the medical field, and especially to mobile medical devices, where computational resource constraints make heavy models impractical in real clinical environments. Recently, state space models (SSMs), represented by Mamba, have emerged as strong competitors to traditional CNNs and Transformers. In this paper, we deeply explore the key factors that drive parameter counts in Mamba and, based on this analysis, propose the UltraLight Vision Mamba UNet (UltraLight VM-UNet). Specifically, we propose a parallel Vision Mamba method for feature processing, named the PVM Layer, which achieves excellent performance at the lowest computational load while keeping the overall number of processed channels constant. We conducted comparison and ablation experiments against several state-of-the-art lightweight models on three public skin lesion datasets, demonstrating that the UltraLight VM-UNet remains strongly competitive with only 0.049M parameters and 0.060 GFLOPs. In addition, by deeply exploring the key factors behind Mamba's parameter count, this study lays a theoretical foundation for Mamba to become a new mainstream module for lightweight design. The code is available at https://github.com/wurenkai/UltraLight-VM-UNet .
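The channel-splitting intuition behind the PVM Layer can be illustrated abstractly. The sketch below is a hedged illustration, not the paper's implementation: the Mamba block is replaced by a plain linear channel mixer (a real PVM Layer would apply a Mamba/SSM block to each split), and the group count of 4 is an assumption chosen for illustration. The point it demonstrates is that splitting C channels into n parallel branches of width C/n cuts the mixer's parameter count by a factor of n while the total number of processed channels stays constant.

```python
import numpy as np

def parallel_channel_mixing(x, n_groups=4, rng=None):
    """Split channels into n_groups equal parts, mix each part
    independently, then concatenate back to the full width.
    x: (tokens, C) with C divisible by n_groups."""
    rng = rng or np.random.default_rng(0)
    t, c = x.shape
    g = c // n_groups
    outs = []
    for i in range(n_groups):
        # Stand-in for a Mamba block: a (g, g) linear channel mixer.
        w = rng.standard_normal((g, g)) / np.sqrt(g)
        outs.append(x[:, i * g:(i + 1) * g] @ w)
    return np.concatenate(outs, axis=1)

# Parameter comparison for a width-C linear mixer:
# one full-width mixer needs C*C weights; four parallel
# quarter-width mixers need 4 * (C/4)^2 = C^2 / 4.
C = 64
full_params = C * C                    # 4096
split_params = 4 * (C // 4) ** 2       # 1024
print(full_params, split_params, full_params // split_params)  # 4096 1024 4
```

The output shape matches the input (tokens, C), so the layer is a drop-in replacement for a full-width mixer at a quarter of the mixing parameters in this 4-group configuration.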