Content-based image retrieval (CBIR) has the potential to significantly improve diagnostic aid and medical research in radiology. Current CBIR systems are typically specialized to certain pathologies, which limits their utility. In response, we propose using vision foundation models as powerful and versatile off-the-shelf feature extractors for content-based medical image retrieval. By benchmarking these models on a comprehensive dataset of 1.6 million 2D radiological images spanning four modalities and 161 pathologies, we identify weakly-supervised models as superior, achieving a precision at 1 (P@1) of up to 0.594. This performance is competitive with that of a specialized model and is achieved without fine-tuning. Our analysis further examines the challenges of retrieving pathological versus anatomical structures and indicates that accurate retrieval of pathological features is more difficult. Despite these challenges, our results underscore the potential of foundation models for CBIR in radiology and suggest a shift towards versatile, general-purpose medical image retrieval systems that do not require task-specific tuning.
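To make the P@1 metric concrete, the following is a minimal sketch of how it could be computed for embedding-based retrieval. It assumes cosine similarity as the ranking function and uses hand-written toy vectors in place of foundation-model embeddings; the function names, labels, and values are hypothetical and are not taken from the benchmark itself.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def precision_at_1(query_embs, query_labels, index_embs, index_labels):
    """Fraction of queries whose top-ranked index item shares the query label."""
    hits = 0
    for query, query_label in zip(query_embs, query_labels):
        # Retrieve the single most similar item from the index.
        best = max(
            range(len(index_embs)),
            key=lambda i: cosine_similarity(query, index_embs[i]),
        )
        hits += int(index_labels[best] == query_label)
    return hits / len(query_embs)


# Toy stand-ins for embeddings produced by a frozen feature extractor
# (assumption: real embeddings would have hundreds of dimensions).
index_embs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
index_labels = ["fracture", "nodule", "effusion"]

query_embs = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.2]]
query_labels = ["fracture", "nodule", "effusion"]

p_at_1 = precision_at_1(query_embs, query_labels, index_embs, index_labels)
print(f"P@1 = {p_at_1:.3f}")
```

In this toy setup the third query is closest to the "fracture" item rather than its own "effusion" class, so two of three queries are retrieved correctly. Because the feature extractor is used as-is, evaluating a different model under this scheme only requires swapping the embeddings.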