Automatic medical image segmentation has advanced considerably thanks to powerful deep representation learning. The success of the Transformer has spurred research into its variants and the large-scale replacement of traditional CNN modules. However, this trend often overlooks the intrinsic feature extraction capabilities of the Transformer, as well as the refinements that minor adjustments can bring to both the overall model and the Transformer module. This study proposes a novel deep medical image segmentation framework, DA-TransUNet, which introduces the Transformer and a dual attention block (DA-Block) into the encoder and decoder of the traditional U-shaped architecture. Unlike prior Transformer-based solutions, DA-TransUNet combines the attention mechanism of the Transformer with the multifaceted feature extraction of the DA-Block, efficiently integrating global, local, and multi-scale features to enhance medical image segmentation. Experimental results show that adding a dual attention block before the Transformer layer facilitates feature extraction in the U-Net structure. Furthermore, incorporating dual attention blocks into the skip connections enhances feature transfer to the decoder, thereby improving segmentation performance. Experiments across various medical image segmentation benchmarks show that DA-TransUNet significantly outperforms state-of-the-art methods. The code and parameters of our model will be publicly available at https://github.com/SUN-1024/DA-TransUnet.
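To make the idea of a dual attention block more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the block follows the common position-plus-channel attention design, which the abstract itself does not spell out. The learned projections that a real model would use are omitted, and every function name as well as the `gamma` residual weight is hypothetical. The feature map is stored as a C x N list of lists (channels by flattened spatial positions).

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def add_residual(out, x, gamma):
    """Return gamma * out + x, element by element."""
    return [[gamma * o + v for o, v in zip(o_row, x_row)]
            for o_row, x_row in zip(out, x)]


def position_attention(x, gamma=1.0):
    """Each spatial position attends to all positions (assumed design)."""
    # N x N similarity between positions; each row is normalized to sum to 1.
    attn = [softmax(r) for r in matmul(transpose(x), x)]
    # out[c][i] = sum_j attn[i][j] * x[c][j]
    out = matmul(x, transpose(attn))
    return add_residual(out, x, gamma)


def channel_attention(x, gamma=1.0):
    """Each channel attends to all channels (assumed design)."""
    # C x C similarity between channels; each row is normalized to sum to 1.
    attn = [softmax(r) for r in matmul(x, transpose(x))]
    # out[c][i] = sum_k attn[c][k] * x[k][i]
    out = matmul(attn, x)
    return add_residual(out, x, gamma)


def dual_attention(x, gamma=1.0):
    """Fuse both branches by element-wise summation (assumed fusion rule)."""
    pos = position_attention(x, gamma)
    chn = channel_attention(x, gamma)
    return [[p + c for p, c in zip(p_row, c_row)]
            for p_row, c_row in zip(pos, chn)]


if __name__ == "__main__":
    features = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]  # 2 channels, 3 positions
    refined = dual_attention(features)
    print(len(refined), len(refined[0]))
```

The output has the same C x N shape as the input, which is what would let such a block sit in front of a Transformer layer or inside a skip connection without changing the surrounding tensor shapes. With `gamma` set to 0, each branch reduces to the identity, so the attention terms act purely as a residual refinement.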