Semi-supervised learning (SSL) has emerged as a critical paradigm for medical image segmentation, mitigating the immense cost of dense annotations. However, prevailing SSL frameworks are fundamentally "inward-looking", recycling information, and with it bias, solely from within the target dataset. Under class imbalance, this design triggers a vicious cycle of confirmation bias, leading to catastrophic failure in recognizing minority classes. To dismantle this systemic issue, we propose a paradigm shift to a multi-level "outward-looking" framework. Our primary innovation is Foundational Knowledge Distillation (FKD), which looks outward beyond the confines of medical imaging by introducing a pre-trained visual foundation model, DINOv3, as an unbiased external semantic teacher. Instead of trusting the student's biased high confidence, our method distills knowledge from DINOv3's robust understanding of high semantic uniqueness, providing a stable, cross-domain supervisory signal that anchors the learning of minority classes. To complement this core strategy, we further look outward within the data by proposing Progressive Imbalance-aware CutMix (PIC), which creates a dynamic curriculum that adaptively forces the model to focus on minority classes in both the labeled and unlabeled subsets. This layered strategy forms our framework, DINO-Mix, which breaks the vicious cycle of bias and achieves strong performance on the challenging semi-supervised class-imbalanced medical image segmentation benchmarks Synapse and AMOS.
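The core idea behind imbalance-aware CutMix, sampling paste regions so that rare classes are over-represented in the mixed images, can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's actual PIC implementation: the function name, the fixed patch size, and the inverse-frequency pixel weighting are all assumptions for illustration.

```python
import numpy as np

def imbalance_aware_cutmix(img_a, mask_a, img_b, mask_b, class_freq, rng=None):
    """Paste a patch from (img_b, mask_b) into (img_a, mask_a).

    The patch centre is sampled with probability inversely proportional to
    the frequency of each pixel's class, so minority-class regions are
    pasted more often. A minimal sketch, not the paper's PIC method.
    """
    rng = rng or np.random.default_rng()
    # Inverse-frequency weight per pixel: rare classes get sampled more often.
    weights = 1.0 / class_freq[mask_b]
    weights = weights / weights.sum()
    flat = rng.choice(mask_b.size, p=weights.ravel())
    cy, cx = np.unravel_index(flat, mask_b.shape)
    h, w = mask_b.shape
    ph, pw = h // 4, w // 4  # fixed patch size, for simplicity
    y0 = int(np.clip(cy - ph // 2, 0, h - ph))
    x0 = int(np.clip(cx - pw // 2, 0, w - pw))
    img_out, mask_out = img_a.copy(), mask_a.copy()
    img_out[y0:y0 + ph, x0:x0 + pw] = img_b[y0:y0 + ph, x0:x0 + pw]
    mask_out[y0:y0 + ph, x0:x0 + pw] = mask_b[y0:y0 + ph, x0:x0 + pw]
    return img_out, mask_out
```

A "progressive" curriculum in the spirit of PIC would then sharpen this bias over training, e.g. by raising the inverse-frequency weights to a growing exponent, so the model attends to minority classes more aggressively as pseudo-labels stabilize.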