Convolutional neural networks (CNNs) and Transformer variants have emerged as the leading backbones for medical image segmentation. However, because they are limited in either preserving global image context or efficiently handling irregularly shaped objects, these backbones struggle to integrate information from diverse anatomical regions and to reduce inter-individual variability, particularly for the vasculature. Motivated by the success of graph neural networks (GNNs) in capturing topological properties and non-Euclidean relationships across various fields, we propose NexToU, a novel hybrid architecture for medical image segmentation. NexToU comprises improved Pool GNN and Swin GNN modules, derived from Vision GNN (ViG), that learn both global and local topological representations while minimizing computational cost. To address the containment and exclusion relationships among anatomical structures, we reformulate the topological interaction (TI) module based on the properties of binary trees, rapidly encoding the topological constraints into NexToU. Extensive experiments on three datasets (covering distinct imaging dimensions, disease types, and imaging modalities) demonstrate that our method consistently outperforms other state-of-the-art (SOTA) architectures. The code is publicly available at https://github.com/PengchengShi1220/NexToU.
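To make the containment and exclusion constraints concrete, the following minimal sketch flags pixels in a 2D label map where two labels that should never be adjacent touch each other. It is an illustration of the kind of constraint involved, not the TI module of NexToU: the function name `violating_pixels`, the label values (0 for background, 1 for vessel wall, 2 for lumen), the toy grids, and the use of 4-connectivity are all assumptions made for this example.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]

# Hypothetical helper for illustration only; not part of the NexToU code base.
def violating_pixels(labels: Grid, a: int, b: int) -> Set[Tuple[int, int]]:
    """Return coordinates of pixels labelled `a` or `b` that have a
    4-connected neighbour carrying the other label."""
    rows, cols = len(labels), len(labels[0])
    bad: Set[Tuple[int, int]] = set()
    for r in range(rows):
        for c in range(cols):
            here = labels[r][c]
            if here not in (a, b):
                continue
            other = b if here == a else a
            # Check the four axis-aligned neighbours inside the grid.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == other:
                    bad.add((r, c))
                    break
    return bad

# Assumed toy labels: 0 = background, 1 = wall, 2 = lumen.
# Containment "lumen inside wall" implies lumen must never touch background.
ok = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 2, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
broken = [row[:] for row in ok]
broken[2][3] = 0  # open a gap in the wall next to the lumen

print(sorted(violating_pixels(ok, 0, 2)))      # []
print(sorted(violating_pixels(broken, 0, 2)))  # [(2, 2), (2, 3)]
```

In a training setting, a mask of such pixels could be used to weight an additional loss term; whether and how NexToU does this is described in the paper and repository, not in this sketch.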