RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation

In recent years, deep learning has made significant advances in medical image segmentation, particularly with convolutional neural networks (CNNs) and transformer models. However, CNNs are limited in capturing long-range dependencies, while transformers suffer from high computational complexity. To address this, we propose RWKV-UNet, a novel model that integrates the RWKV (Receptance Weighted Key Value) structure into the U-Net architecture. This integration enhances the model's ability to capture long-range dependencies and improves contextual understanding, which is crucial for accurate medical image segmentation. We build a strong encoder from Global-Local Spatial Perception (GLSP) blocks that combine CNN and RWKV components. We also propose a Cross-Channel Mix (CCM) module that improves the skip connections through multi-scale feature fusion, integrating channel information globally. Experiments on 11 benchmark datasets show that RWKV-UNet achieves state-of-the-art performance on various types of medical image segmentation tasks. Additionally, the smaller variants RWKV-UNet-S and RWKV-UNet-T balance accuracy and computational efficiency, making them suitable for broader clinical applications.
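
To make the complexity argument concrete, the following is a minimal sketch of the weighted key-value (WKV) averaging used in the original RWKV language model, reduced to a single channel. It is not the RWKV-UNet implementation: the function names `wkv_direct` and `wkv_recurrent`, the scalar decay `w`, the current-token bonus `u`, and the causal 1D scan are illustrative assumptions, and a vision model would likely use a different scan over image tokens. The sketch only shows that the same outputs can be produced either by summing over all earlier positions (quadratic in sequence length) or by carrying a small running state (linear in sequence length).

```python
import math


def wkv_direct(k, v, w, u):
    """Weighted average computed by revisiting every earlier position: O(T^2)."""
    out = []
    for t in range(len(k)):
        # The current token gets a separate bonus weight exp(u + k_t).
        num = math.exp(u + k[t]) * v[t]
        den = math.exp(u + k[t])
        for i in range(t):
            # Earlier tokens are down-weighted exponentially with distance.
            weight = math.exp(-(t - 1 - i) * w + k[i])
            num += weight * v[i]
            den += weight
        out.append(num / den)
    return out


def wkv_recurrent(k, v, w, u):
    """Same values from a running numerator/denominator state: O(T)."""
    a = b = 0.0  # decayed sums over all tokens seen so far
    decay = math.exp(-w)
    out = []
    for kt, vt in zip(k, v):
        bonus = math.exp(u + kt)
        out.append((a + bonus * vt) / (b + bonus))
        # Fold the current token into the state for later positions.
        a = decay * a + math.exp(kt) * vt
        b = decay * b + math.exp(kt)
    return out


# Toy inputs (arbitrary values chosen for illustration only).
k = [0.1, -0.3, 0.5, 0.0, 0.2]
v = [1.0, 2.0, -1.0, 0.5, 3.0]
direct = wkv_direct(k, v, w=0.5, u=0.3)
recurrent = wkv_recurrent(k, v, w=0.5, u=0.3)
```

Both functions return the same sequence up to floating-point rounding. The sketch omits the running-maximum trick that practical implementations use to keep the exponentials numerically stable for large keys.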
