Unified Medical Image-Text-Label Contrastive Learning With Continuous Prompt

Contrastive Language-Image Pre-training (CLIP) [13] can leverage large datasets of unlabeled image-text pairs and has demonstrated impressive performance on various downstream tasks. Because annotating medical data is time-consuming and laborious, image-text pre-training is a promising way to exploit large-scale medical image and radiology report datasets. However, medical image-text pre-training faces several challenges: (1) Due to privacy concerns, the amount of available medical data is small compared with natural-image data, which weakens the generalization ability of the model. (2) Medical images are highly similar to one another and differ only in fine-grained details, which produces a large number of false-negative sample pairs in contrastive learning. (3) Hand-crafted prompts usually differ from natural medical image reports, and subtle changes in wording can lead to significant differences in performance. In this paper, we propose a unified image-text-label contrastive learning framework based on continuous prompts, with three main contributions. First, we unify image, text, and label data, which greatly expands the training data the model can use. Second, we address data diversity and the sensitivity of model performance to hand-crafted prompts by introducing continuous implicit prompts. Lastly, we propose image-text-label contrastive training to mitigate the problem of excessive false-negative samples. Through extensive experiments, we demonstrate that the Unified Medical Contrastive Learning (UMCL) framework performs well on several downstream tasks.
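The abstract does not give the loss itself, so the following is only a minimal sketch of one common way to reduce false negatives with labels, not the UMCL objective. It assumes that every image-text pair in a batch sharing a class label is treated as a positive, instead of only the matched pair as in a CLIP-style loss. The function name `label_aware_contrastive_loss`, the temperature of 0.07, and the averaging over positives are all illustrative assumptions.

```python
import numpy as np

def label_aware_contrastive_loss(img_emb, txt_emb, labels, temperature=0.07):
    """Image-to-text contrastive loss with label-defined positives.

    Hypothetical illustration: pairs (i, j) with labels[i] == labels[j]
    count as positives, so same-class samples are not pushed apart.
    """
    # L2-normalise so the dot product is a cosine similarity.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    # Similarity of every image to every text, scaled by the temperature.
    logits = img @ txt.T / temperature

    # Numerically stable log-softmax over the texts for each image.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive mask: 1 where image i and text j share a label.
    labels = np.asarray(labels)
    positives = (labels[:, None] == labels[None, :]).astype(float)

    # Average negative log-likelihood over each image's positives.
    per_image = -(positives * log_prob).sum(axis=1) / positives.sum(axis=1)
    return float(per_image.mean())
```

When every label in the batch is unique, the positive mask is the identity matrix and the function reduces to the usual one-directional CLIP-style loss; a symmetric version would add the text-to-image term computed on `logits.T`.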

