U-Net is currently the most widely used architecture for medical image segmentation. Owing to its encoder-decoder structure and skip connections, it can effectively extract features from input images to segment target regions. Common U-Net variants are built on convolutional operations or Transformers, modeling dependencies among local or global information to perform medical image analysis tasks. However, the convolutional layers, fully connected layers, and attention mechanisms involved introduce a large number of parameters and often require deep stacking of layers to model complex nonlinear relationships, which can hinder training. To address these issues, we propose TransUKAN. Specifically, we improve the Kolmogorov-Arnold Network (KAN) to reduce its memory usage and computational cost. On this basis, we explore an effective combination of the KAN, Transformer, and U-Net structures. This approach enhances the model's ability to capture nonlinear relationships while introducing only a small number of additional parameters, and it compensates for the Transformer's weakness in extracting local information. We validate TransUKAN on multiple medical image segmentation tasks. Experimental results demonstrate that TransUKAN achieves excellent performance with significantly fewer parameters. The code will be available at https://github.com/wuyanlin-wyl/TransUKAN.
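The abstract does not detail the improved KAN layer itself, but the general idea behind a KAN layer is that each input-output edge carries a learnable univariate function rather than a single scalar weight. The following is a minimal NumPy sketch of that idea using a fixed RBF basis; the class name, basis choice, and hyperparameters are illustrative assumptions, not the paper's actual efficient KAN variant.

```python
import numpy as np

def rbf_basis(x, centers, width=0.5):
    """Evaluate Gaussian RBF basis functions at each input.
    x: (batch, in_dim) -> returns (batch, in_dim, n_basis)."""
    return np.exp(-((x[..., None] - centers) ** 2) / (2 * width ** 2))

class KANLayer:
    """Illustrative KAN-style layer (assumed structure, not the paper's):
    output_j = sum_i phi_ij(x_i), where each univariate function phi_ij
    is a learnable combination of shared RBF basis functions."""
    def __init__(self, in_dim, out_dim, n_basis=8, seed=0):
        rng = np.random.default_rng(seed)
        self.centers = np.linspace(-1.0, 1.0, n_basis)
        # One coefficient vector per (input, output) edge.
        self.coef = rng.normal(scale=0.1, size=(in_dim, out_dim, n_basis))

    def __call__(self, x):
        # b: (batch, in_dim, n_basis); contract over inputs and basis terms.
        b = rbf_basis(x, self.centers)
        return np.einsum('bik,iok->bo', b, self.coef)

layer = KANLayer(in_dim=4, out_dim=2)
x = np.random.default_rng(1).uniform(-1, 1, size=(3, 4))
y = layer(x)
print(y.shape)  # (3, 2)
```

Because the nonlinearity lives in the per-edge basis coefficients, such a layer can model nonlinear relationships without stacking many layers, which is the property the abstract attributes to introducing KAN into the Transformer/U-Net combination.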