A major challenge for computational research in 3D medical imaging is the lack of comprehensive datasets. To address this, we introduce CT-RATE, the first 3D medical imaging dataset that pairs images with textual reports. CT-RATE consists of 25,692 non-contrast chest CT volumes from 21,304 unique patients, expanded to 50,188 volumes through various reconstructions, along with the corresponding radiology text reports. Leveraging CT-RATE, we developed CT-CLIP, a CT-focused contrastive language-image pre-training framework. As a versatile, self-supervised model, CT-CLIP is designed for broad application and does not require task-specific training. Remarkably, CT-CLIP outperforms state-of-the-art, fully supervised methods in multi-abnormality detection across all key metrics, thus eliminating the need for manual annotation. We also demonstrate its utility in case retrieval with either image or text queries, thereby advancing knowledge dissemination. The open-source release of CT-RATE and CT-CLIP marks a significant advance in medical AI, enhancing 3D imaging analysis and fostering innovation in healthcare.
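To make the phrase "contrastive language-image pre-training" concrete, the following is a minimal sketch of a generic CLIP-style symmetric contrastive objective. It is not taken from the CT-CLIP implementation: the function names (`l2_normalize`, `log_softmax`, `clip_style_loss`), the temperature value of 0.07, and the random arrays standing in for CT-volume and report embeddings are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each embedding to unit length so dot products are cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of image_emb and row i of text_emb are assumed to come from the
    same study (one volume and its report); every other row in the batch
    acts as a negative. The temperature is a hypothetical default.
    """
    img = l2_normalize(image_emb)
    txt = l2_normalize(text_emb)
    logits = img @ txt.T / temperature  # shape: (batch, batch)
    idx = np.arange(logits.shape[0])
    # Image-to-text: each row should place its mass on the diagonal entry.
    loss_i2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each column should place its mass on the diagonal entry.
    loss_t2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)


# Toy data: random vectors stand in for encoder outputs (an assumption).
rng = np.random.default_rng(0)
volume_emb = rng.normal(size=(4, 8))
report_emb_matched = volume_emb + 0.01 * rng.normal(size=(4, 8))
report_emb_shuffled = report_emb_matched[::-1]  # deliberately mispaired

loss_matched = clip_style_loss(volume_emb, report_emb_matched)
loss_shuffled = clip_style_loss(volume_emb, report_emb_shuffled)
print(f"matched pairs:  {loss_matched:.4f}")
print(f"shuffled pairs: {loss_shuffled:.4f}")
```

In this toy setup the correctly paired batch yields a much lower loss than the mispaired one, which is the signal such an objective uses to align the two embedding spaces. Once aligned, the same similarity scores can in principle support retrieval with either an image or a text query, since both live in one space.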