DrasCLR: A Self-supervised Framework of Learning Disease-related and Anatomy-specific Representation for 3D Medical Images

Large-scale volumetric medical images with annotation are rare, costly, and time-prohibitive to acquire. Self-supervised learning (SSL) offers a promising pre-training and feature extraction solution for many downstream tasks because it uses only unlabeled data. Recently, SSL methods based on instance discrimination have gained popularity in the medical imaging domain. However, SSL pre-trained encoders may discriminate an instance using many cues in the image that are not necessarily disease-related. Moreover, pathological patterns are often subtle and heterogeneous, so a suitable method must represent anatomy-specific features that are sensitive to abnormal changes in different body parts. In this work, we present a novel SSL framework, named DrasCLR, for 3D medical imaging to overcome these challenges. We propose two domain-specific contrastive learning strategies: one aims to capture subtle disease patterns inside a local anatomical region, and the other aims to represent severe disease patterns that span larger regions. We formulate the encoder using a conditional hyper-parameterized network, in which the parameters depend on the anatomical location, to extract anatomically sensitive features. Extensive experiments on large-scale computed tomography (CT) datasets of lung images show that our method improves the performance of many downstream prediction and segmentation tasks. The patient-level representation improves the performance of the patient survival prediction task. We show how our method can detect emphysema subtypes via dense prediction. We demonstrate that fine-tuning the pre-trained model can significantly reduce annotation efforts without sacrificing emphysema detection accuracy. Our ablation study highlights the importance of incorporating anatomical context into the SSL framework.
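
To make the location-conditioned encoder idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single linear layer stands in for the 3D encoder, a linear hypernetwork maps a normalized anatomical location to that layer's weights, and an InfoNCE-style loss serves as the instance-discrimination objective (the abstract does not name the loss). All function names (`make_hypernet`, `conditional_linear`, `info_nce`) and the temperature value are hypothetical.

```python
import math
import random


def make_hypernet(loc_dim, in_dim, out_dim, seed=0):
    """Hypothetical hypernetwork: a linear map from a location vector
    to the flattened weights of an (out_dim x in_dim) layer."""
    rng = random.Random(seed)
    n_weights = in_dim * out_dim
    return [[rng.gauss(0.0, 0.1) for _ in range(loc_dim)]
            for _ in range(n_weights)]


def conditional_linear(hyper, location, x, out_dim):
    """Generate layer weights from `location`, then apply them to `x`.
    The same input therefore yields location-dependent features."""
    in_dim = len(x)
    flat = [sum(h * l for h, l in zip(row, location)) for row in hyper]
    weights = [flat[i * in_dim:(i + 1) * in_dim] for i in range(out_dim)]
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def cosine(a, b):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b + 1e-12)


def info_nce(anchor, positive, negatives, temperature=0.2):
    """InfoNCE-style loss (assumed objective): negative log-softmax of
    the positive pair's similarity against all candidate similarities."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_denom - logits[0]


if __name__ == "__main__":
    hyper = make_hypernet(loc_dim=3, in_dim=4, out_dim=2, seed=1)
    patch = [1.0, 2.0, 3.0, 4.0]  # toy stand-in for patch features
    upper = conditional_linear(hyper, [0.1, 0.2, 0.3], patch, out_dim=2)
    lower = conditional_linear(hyper, [0.9, 0.1, 0.5], patch, out_dim=2)
    print(upper, lower)
    print(info_nce(upper, upper, [lower]))
```

In this sketch the two contrastive strategies would differ only in how positives and negatives are chosen (for example, within one anatomical region versus across neighboring regions); that pairing rule is an illustrative assumption, since the abstract does not spell it out.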
