Volume-wise labeling of 3D medical images is time-consuming and requires expertise. As a result, there is growing interest in semi-supervised learning (SSL) techniques that train models with limited labeled data. However, the challenges and practical applications extend beyond SSL to settings such as unsupervised domain adaptation (UDA) and semi-supervised domain generalization (SemiDG). This work aims to develop a generic SSL framework that can handle all three settings. We identify two main obstacles to this goal in existing SSL frameworks: 1) weakness in capturing distribution-invariant features; and 2) the tendency for unlabeled data to be overwhelmed by labeled data, which leads to over-fitting to the labeled data during training. To address these issues, we propose an Aggregating & Decoupling framework. The aggregating part consists of a Diffusion encoder that constructs a common knowledge set by extracting distribution-invariant features from information aggregated across multiple distributions/domains. The decoupling part consists of three decoders that decouple the training processes for labeled and unlabeled data, thus avoiding over-fitting to labeled data, specific domains, and classes. We evaluate the proposed framework on four benchmark datasets covering SSL, class-imbalanced SSL, UDA, and SemiDG. The results show notable improvements over state-of-the-art methods across all four settings, indicating the framework's potential to tackle more challenging SSL scenarios. Code and models are available at: https://github.com/xmed-lab/GenericSSL.