While content-based image retrieval (CBIR) has been extensively studied for natural images, its application to medical images presents ongoing challenges, primarily due to the 3D nature of medical images. Recent studies have shown the potential of pre-trained vision embeddings for CBIR in radiology image retrieval. However, a benchmark for the retrieval of 3D volumetric medical images is still lacking, which hinders objective evaluation and comparison of the efficiency of proposed CBIR approaches in medical imaging. In this study, we extend previous work and establish a benchmark for region-based and multi-organ retrieval using the TotalSegmentator dataset (TS), which provides detailed multi-organ annotations. We benchmark embeddings derived from supervised models pre-trained on medical images against embeddings derived from unsupervised models pre-trained on non-medical images, for 29 coarse and 104 detailed anatomical structures at the volume and region levels. We adopt a late interaction re-ranking method, inspired by text matching, for image retrieval and compare it against the original method proposed for volume and region retrieval, achieving a retrieval recall of 1.0 for diverse anatomical regions with a wide size range. The findings and methodologies presented in this paper provide essential insights and benchmarks for the development and evaluation of CBIR approaches in medical imaging.
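To make the late interaction idea concrete, the following is a minimal sketch of a MaxSim-style re-ranking step of the kind used in text matching, adapted to images. It is an illustration under stated assumptions, not the implementation evaluated in this study: it assumes each volume or region is represented by a set of per-slice embedding vectors, that cosine similarity is the matching function, and that the candidates come from some first-stage retrieval. The names `maxsim_score` and `rerank` are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def maxsim_score(query_slices, candidate_slices):
    """Late interaction score between two sets of slice embeddings.

    For every query slice, take its best cosine match among the candidate
    slices, then sum these maxima over all query slices.
    Assumption: inputs have shape (n_slices, dim); slice counts may differ.
    """
    q = l2_normalize(np.asarray(query_slices, dtype=np.float64))
    c = l2_normalize(np.asarray(candidate_slices, dtype=np.float64))
    sim = q @ c.T  # shape (n_query_slices, n_candidate_slices)
    return float(sim.max(axis=1).sum())


def rerank(query_slices, candidates, top_k=None):
    """Re-order first-stage candidates by their late interaction score.

    `candidates` maps a (hypothetical) volume id to its slice embeddings.
    Returns (id, score) pairs, best first, truncated to top_k if given.
    """
    scored = [
        (cand_id, maxsim_score(query_slices, emb))
        for cand_id, emb in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy 2-D embeddings, chosen only to make the arithmetic easy to follow.
    query = [[1.0, 0.0], [0.0, 1.0]]
    candidates = {
        "vol_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
        "vol_b": [[1.0, 1.0]],
    }
    for cand_id, score in rerank(query, candidates):
        print(cand_id, round(score, 3))
```

Because the score keeps one similarity per query slice rather than collapsing each volume to a single vector, a candidate that matches every part of the query ranks above one that matches only on average. In the toy example, `vol_a` contains an exact match for both query slices, whereas `vol_b` matches each only partially.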