Mask-HybridGNet: Graph-based segmentation with emergent anatomical correspondence from pixel-level supervision

Graph-based medical image segmentation methods represent anatomical structures as boundary graphs, providing fixed-topology landmarks and inherent population-level correspondences. However, their clinical adoption has been hindered by one major requirement: they need training datasets with manually annotated landmarks that maintain point-to-point correspondences across patients, and such datasets rarely exist in practice. We introduce Mask-HybridGNet, a framework that trains graph-based models directly on standard pixel-wise masks, eliminating the need for manual landmark annotations. Our approach aligns variable-length ground truth boundaries with fixed-length landmark predictions by combining Chamfer distance supervision with edge-based regularization, which encourages local smoothness and a regular landmark distribution; the predictions are further refined via differentiable rasterization. A significant emergent property of this framework is that predicted landmark positions become consistently associated with specific anatomical locations across patients without explicit correspondence supervision. This implicit atlas learning enables temporal tracking, cross-slice reconstruction, and morphological population analyses. Beyond direct segmentation, Mask-HybridGNet can extract correspondences from existing segmentation masks, allowing it to generate stable anatomical atlases from the output of any high-quality pixel-based model. Experiments across chest radiography, cardiac ultrasound, cardiac MRI, and fetal imaging demonstrate that our model achieves results competitive with state-of-the-art pixel-based methods, while ensuring anatomical plausibility by enforcing boundary connectivity through a fixed graph adjacency matrix. The framework thus leverages the wide availability of standard segmentation masks to build structured models that maintain topological integrity and provide implicit correspondences.
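
As a rough illustration of the supervision described above, the following NumPy sketch computes a symmetric Chamfer distance between a fixed-length set of predicted landmarks and a variable-length ground truth boundary, plus a simple edge-length regularizer over a closed contour. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the variance-of-edge-lengths form of the regularizer, and the weight `0.1` are all hypothetical, the differentiable rasterization step is omitted, and a real training loop would use an autodiff framework rather than plain NumPy.

```python
import numpy as np

def chamfer_distance(pred, target):
    """Symmetric Chamfer distance between point sets of shape (N, 2) and (M, 2)."""
    # Pairwise squared Euclidean distances, shape (N, M).
    d2 = ((pred[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1)
    # Mean nearest-neighbour distance in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def closed_contour_edges(n):
    """Edge list (n, 2) linking landmark i to landmark (i + 1) mod n."""
    idx = np.arange(n)
    return np.stack([idx, (idx + 1) % n], axis=1)

def edge_regularizer(pred, edges):
    """Variance of edge lengths (an assumed form): zero for evenly spaced landmarks."""
    lengths = np.linalg.norm(pred[edges[:, 0]] - pred[edges[:, 1]], axis=1)
    return lengths.var()

def circle(n, radius=1.0):
    """n evenly spaced points on a circle, standing in for a boundary."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return radius * np.stack([np.cos(t), np.sin(t)], axis=1)

# Variable-length ground truth boundary (e.g. pixels traced from a mask contour).
target = circle(200)
# Fixed-length landmark prediction with a fixed adjacency (closed contour).
pred = circle(16)
edges = closed_contour_edges(16)

# Hypothetical weighting of the two terms.
loss = chamfer_distance(pred, target) + 0.1 * edge_regularizer(pred, edges)
```

Because the Chamfer term only matches each point to its nearest neighbour in the other set, it needs no point-to-point correspondence between the 16 landmarks and the 200 boundary points, which is what lets mask-derived contours of any length act as supervision.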
