Accurately localizing and identifying vertebrae in CT images is crucial for various clinical applications. However, most existing methods operate on cropped 3D patches, which incurs large computation costs and limits the global information available to the network. In this paper, we propose a multi-view method for vertebra localization and identification from CT images, converting the 3D problem into a 2D localization and identification task on different views. Free from the limitation of the 3D cropped patch, our method can learn multi-view global information naturally. Moreover, to better capture anatomical structure information from different views, we develop a multi-view contrastive learning strategy to pre-train the backbone. We further propose a Sequence Loss to maintain the sequential structure embedded along the vertebrae. Evaluation results demonstrate that, with only two 2D networks, our method localizes and identifies vertebrae in CT images accurately and consistently outperforms state-of-the-art methods. Our code is available at https://github.com/ShanghaiTech-IMPACT/Multi-View-Vertebra-Localization-and-Identification-from-CT-Images.
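To make the idea of converting the 3D problem into 2D views concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the views are generated, so this sketch assumes a simple maximum intensity projection along two axes of a volume stored as `(z, y, x)`, with `y` taken as the anterior-posterior axis and `x` as the left-right axis. The function name `project_views` and the axis convention are hypothetical.

```python
import numpy as np


def project_views(volume: np.ndarray) -> dict:
    """Collapse a 3D volume with axes (z, y, x) into two 2D views.

    Assumption: maximum intensity projection stands in for whatever
    projection the paper actually uses; the abstract does not specify it.
    """
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume with axes (z, y, x)")
    return {
        # Collapse the assumed anterior-posterior axis (y): image is (z, x).
        "coronal": volume.max(axis=1),
        # Collapse the assumed left-right axis (x): image is (z, y).
        "sagittal": volume.max(axis=2),
    }


# Synthetic volume with a single bright voxel standing in for bone.
volume = np.zeros((8, 6, 4), dtype=np.float32)
volume[3, 2, 1] = 1000.0

views = project_views(volume)
for name, image in views.items():
    print(name, image.shape)
```

Each projection keeps the full extent of the scan along the spine axis, which is the property the abstract relies on when it contrasts whole-view 2D inputs with cropped 3D patches.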