UAE: Universal Anatomical Embedding on Multi-modality Medical Images

Identifying specific anatomical structures (\textit{e.g.}, lesions or landmarks) in medical images plays a fundamental role in medical image analysis. Exemplar-based landmark detection methods are receiving increasing attention because they can detect arbitrary anatomical points at inference time without requiring landmark annotations during training. They use self-supervised learning to acquire a discriminative embedding for each voxel within the image. These approaches identify corresponding landmarks through nearest-neighbor matching and have demonstrated promising results across various tasks. However, current methods still face challenges in: (1) differentiating voxels with similar appearance but different semantic meanings (\textit{e.g.}, two adjacent structures without clear borders); (2) matching voxels with similar semantics but markedly different appearance (\textit{e.g.}, the same vessel before and after contrast injection); and (3) cross-modality matching (\textit{e.g.}, CT-MRI landmark-based registration). To overcome these challenges, we propose universal anatomical embedding (UAE), a unified framework designed to learn appearance, semantic, and cross-modality anatomical embeddings. Specifically, UAE incorporates three key innovations: (1) semantic embedding learning with a prototypical contrastive loss; (2) a fixed-point-based matching strategy; and (3) an iterative approach for cross-modality embedding learning. We thoroughly evaluated UAE across intra- and inter-modality tasks, including one-shot landmark detection, lesion tracking on longitudinal CT scans, and CT-MRI affine/rigid registration with varying fields of view. Our results suggest that UAE outperforms state-of-the-art methods, offering a robust and versatile approach for landmark-based medical image analysis tasks. Code and trained models are available at: \url{https://shorturl.at/bgsB3}
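
The nearest-neighbor matching step used by exemplar-based methods can be illustrated with a minimal sketch. It is not the UAE implementation: the function name `match_landmark`, the `(C, D, H, W)` array layout, and the use of cosine similarity on L2-normalized embeddings are all assumptions made for illustration. Given per-voxel embeddings for a template image and a query image, the sketch takes the embedding at an annotated template point and returns the query voxel whose embedding is most similar.

```python
import numpy as np


def match_landmark(template_emb, query_emb, template_point):
    """Find the query voxel whose embedding best matches a template voxel.

    Hypothetical helper, for illustration only.
    template_emb, query_emb: float arrays of shape (C, D, H, W), assumed to be
        L2-normalized along the channel axis, so a dot product is a cosine
        similarity.
    template_point: (z, y, x) index of the landmark in the template volume.
    Returns the (z, y, x) index of the best match and its similarity score.
    """
    z, y, x = template_point
    anchor = template_emb[:, z, y, x]  # embedding of the landmark, shape (C,)

    # Similarity between the anchor and every query voxel, shape (D, H, W).
    similarity = np.tensordot(anchor, query_emb, axes=([0], [0]))

    # Nearest neighbor in embedding space = voxel with the highest similarity.
    best = np.unravel_index(np.argmax(similarity), similarity.shape)
    return tuple(int(i) for i in best), float(similarity[best])
```

A dense search like this compares the anchor against every voxel of the query volume, so its cost grows linearly with volume size. The three challenges listed in the abstract concern cases where the highest-similarity voxel found this way is not the anatomically correct one.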
