Although deep learning (DL) methods have shown tremendous potential in many medical image analysis tasks, the practical application of medical DL models is limited by the scarcity of manually annotated data samples. Noting that clinical radiology examinations are accompanied by radiology reports describing the images, we propose to develop a foundation model for multi-modal head MRI by applying contrastive learning to the images and the corresponding radiology findings. In particular, we propose a contrastive learning framework that integrates a mixed syntactic and semantic similarity matching metric to reduce the dependence on the extremely large datasets required by conventional contrastive learning. Our proposed similarity-enhanced contrastive language-image pretraining (SeLIP) effectively extracts more useful features. Experiments show that SeLIP performs well on many downstream tasks, including image-text retrieval, classification, and image segmentation, which highlights the importance of accounting for similarities among the texts describing different images when developing medical image foundation models.
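The abstract does not spell out the objective, but a similarity matching metric between reports is typically folded into a CLIP-style loss by softening the one-hot image-text matching targets. The following is a minimal NumPy sketch of that idea, not the paper's actual implementation: the function name `selip_style_loss`, the blending weight `alpha`, and the precomputed report-report similarity matrix `txt_sim` are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def selip_style_loss(img_emb, txt_emb, txt_sim, alpha=0.5, tau=0.07):
    """Sketch of a contrastive loss with similarity-softened targets.

    img_emb, txt_emb: (n, d) image / report embeddings for a batch.
    txt_sim: (n, n) report-report similarity scores (e.g. a mixed
             syntactic + semantic metric, as the abstract describes);
             its exact form here is a hypothetical placeholder.
    alpha: weight on the standard one-hot (diagonal) targets.
    tau: temperature, as in CLIP-style training.
    """
    # L2-normalize embeddings so dot products are cosine similarities
    img_emb = img_emb / np.linalg.norm(img_emb, axis=-1, keepdims=True)
    txt_emb = txt_emb / np.linalg.norm(txt_emb, axis=-1, keepdims=True)
    logits = img_emb @ txt_emb.T / tau
    n = logits.shape[0]

    # Blend hard diagonal targets with soft targets derived from
    # report similarity; rows still sum to 1, so this is a valid
    # soft-label cross entropy.
    hard = np.eye(n)
    soft = softmax(txt_sim / tau, axis=-1)
    targets = alpha * hard + (1 - alpha) * soft

    # symmetric image-to-text and text-to-image cross entropy
    log_p_i2t = np.log(softmax(logits, axis=-1))
    log_p_t2i = np.log(softmax(logits.T, axis=-1))
    loss = -0.5 * ((targets * log_p_i2t).sum(axis=1).mean()
                   + (targets.T * log_p_t2i).sum(axis=1).mean())
    return loss
```

With `alpha = 1` this reduces to the standard CLIP objective; lowering `alpha` lets pairs whose reports are similar receive partial credit, which is one way a similarity metric can ease the need for very large paired datasets.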