We propose a novel continual self-supervised learning (CSSL) framework that simultaneously learns diverse features from chest computed tomography (CT) images acquired under multiple window settings and preserves data privacy. Building a robust and highly generalizable model for medical image diagnosis is challenging, mainly because of the scarcity of large-scale, accurately annotated datasets and the domain shifts inherent to dynamic healthcare environments. In chest CT specifically, these domain shifts often arise from differences in window settings, which are optimized for distinct clinical purposes. Previous CSSL frameworks often mitigated domain shift by reusing past data, an approach that is typically impractical owing to privacy constraints. Our approach addresses these challenges through continual pretraining on unlabeled images, which effectively captures the relationship between previously learned knowledge and new information across training stages. By incorporating a latent replay-based mechanism into CSSL, our method mitigates the catastrophic forgetting caused by domain shifts during continual pretraining while preserving data privacy. Additionally, we introduce a feature distillation technique that integrates Wasserstein distance-based knowledge distillation (WKD) and batch-knowledge ensemble (BKE), enhancing the ability of the model to learn meaningful representations that are robust to domain shift. Finally, we validate our approach on chest CT images obtained under two different window settings, demonstrating superior performance compared with other approaches.
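To illustrate why window settings can act as a source of domain shift, the following minimal sketch applies two display windows to the same Hounsfield unit (HU) values. It is an illustration under stated assumptions, not the preprocessing used in this work: the function name `apply_window`, the preset values (a lung window of level -600, width 1500 and a mediastinal window of level 40, width 400), and the sample HU values are all hypothetical choices made for the example.

```python
import numpy as np


def apply_window(hu, level, width):
    """Clip HU values to a display window and rescale them to [0, 1].

    Hypothetical helper for illustration; not taken from the paper.
    """
    lo = level - width / 2.0
    hi = level + width / 2.0
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)


# Assumed presets; actual clinical values vary by site and protocol.
LUNG = {"level": -600.0, "width": 1500.0}
MEDIASTINAL = {"level": 40.0, "width": 400.0}

# Illustrative HU values, roughly: air, lung tissue, water, soft tissue, bone.
hu = np.array([-1000.0, -700.0, 0.0, 60.0, 700.0])

lung_view = apply_window(hu, **LUNG)
mediastinal_view = apply_window(hu, **MEDIASTINAL)

print("lung window:       ", np.round(lung_view, 3))
print("mediastinal window:", np.round(mediastinal_view, 3))
```

Under these assumed presets, the two lowest HU values remain distinguishable in the lung window but both collapse to 0 in the mediastinal window, so the same scan yields differently distributed inputs depending on the window applied.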