Automated medical image segmentation can help doctors diagnose faster and more accurately. Deep-learning-based models for medical image segmentation have made great progress in recent years. However, existing models do not make effective use of the Transformer and the MLP to improve the U-shaped architecture efficiently. In addition, the multi-scale features of the MLP have not been fully extracted in the bottleneck of the U-shaped architecture. In this paper, we propose an efficient U-shaped architecture based on the Swin Transformer and a multi-scale MLP, namely STM-UNet. Specifically, a Swin Transformer block is added to the skip connections of STM-UNet in the form of a residual connection, which enhances the modeling of global features and long-range dependencies. Meanwhile, a novel PCAS-MLP with a parallel convolution module is designed and placed in the bottleneck of our architecture to improve segmentation performance. Experimental results on ISIC 2016 and ISIC 2018 demonstrate the effectiveness of the proposed method, which outperforms several state-of-the-art methods in terms of IoU and Dice. Our method also achieves a better trade-off between segmentation accuracy and model complexity.
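For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how IoU and Dice are commonly computed for a pair of binary masks. It is not taken from the STM-UNet code: the function name `iou_and_dice`, the nested-list mask format, and the convention of scoring two empty masks as a perfect match are all assumptions made for illustration.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as foreground in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    pred_area = sum(flat_pred)
    target_area = sum(flat_target)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    iou, dice = iou_and_dice(pred, target)
    print(f"IoU={iou:.4f} Dice={dice:.4f}")
```

For a single pair of masks the two scores are related by Dice = 2·IoU / (1 + IoU), so they always rank that pair the same way. Averages over a dataset can still differ in ranking, which is one reason both are usually reported.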