While content-based image retrieval (CBIR) has been studied extensively for natural images, applying it to medical images remains challenging, primarily because of their 3D nature. Recent studies have shown that pre-trained vision embeddings hold promise for CBIR of radiology images. However, a benchmark for the retrieval of 3D volumetric medical images is still lacking, which hinders objective evaluation and comparison of the efficiency of proposed CBIR approaches in medical imaging. In this study, we extend previous work and establish a benchmark for region-based and localized multi-organ retrieval using the TotalSegmentator dataset (TS), which provides detailed multi-organ annotations. We benchmark embeddings from supervised models pre-trained on medical images against embeddings from unsupervised models pre-trained on non-medical images for 29 coarse and 104 detailed anatomical structures at the volume and region levels. For volumetric image retrieval, we adopt a late interaction re-ranking method inspired by text matching. We compare it against the original method proposed for volume and region retrieval and achieve a retrieval recall of 1.0 for diverse anatomical regions spanning a wide range of sizes. The findings and methodologies presented in this paper provide insights and benchmarks for the further development and evaluation of CBIR approaches in medical imaging.
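To make the idea of late interaction re-ranking concrete, the following is a minimal sketch and not the implementation used in this work. It assumes that each volume is represented by a set of per-slice embedding vectors and that candidates are scored with a MaxSim-style sum, as is common in late interaction text matching: for every query vector, take its best cosine similarity against the candidate's vectors, then sum these maxima. The function names (`l2_normalize`, `maxsim_score`, `rerank`) and the toy data are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row vector to unit length (eps guards against zero vectors)."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def maxsim_score(query_vecs, cand_vecs):
    """MaxSim late interaction score between two sets of embeddings.

    query_vecs: array of shape (n_query, dim), e.g. one row per query slice.
    cand_vecs:  array of shape (n_cand, dim), e.g. one row per candidate slice.
    Assumption: cosine similarity is the per-vector similarity measure.
    """
    q = l2_normalize(np.asarray(query_vecs, dtype=np.float64))
    c = l2_normalize(np.asarray(cand_vecs, dtype=np.float64))
    sim = q @ c.T  # pairwise cosine similarities, shape (n_query, n_cand)
    # Best match per query vector, summed over all query vectors.
    return float(sim.max(axis=1).sum())


def rerank(query_vecs, candidates):
    """Sort candidates (dict: id -> embedding matrix) by descending MaxSim score."""
    scored = [(cid, maxsim_score(query_vecs, vecs)) for cid, vecs in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy example with hypothetical 2-D embeddings standing in for slice embeddings.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
candidates = {
    "vol_a": np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]),
    "vol_b": np.array([[1.0, 0.0]]),
    "vol_c": np.array([[-1.0, 0.0]]),
}
ranked = rerank(query, candidates)
print(ranked)
```

In a retrieval pipeline of this kind, a cheaper first-stage search would typically produce the candidate list, and the set-to-set score would be applied only to that shortlist, since it compares every query vector with every candidate vector.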