Purpose: In curriculum learning, the idea is to train on easier samples first and gradually increase the difficulty, while in self-paced learning, a pacing function defines how quickly training progresses to harder samples. Both methods rely heavily on the ability to score the difficulty of data samples, yet the choice of an optimal scoring function remains an open question. Methodology: Distillation is a knowledge transfer approach in which a teacher network guides a student network by feeding it a sequence of random samples. We argue that guiding student networks with an efficient curriculum strategy can improve model generalization and robustness. For this purpose, we design an uncertainty-based paced curriculum learning scheme in self-distillation for medical image segmentation. We fuse the prediction uncertainty and the annotation boundary uncertainty to develop a novel paced-curriculum distillation (PCD). We use the teacher model to obtain the prediction uncertainty, and spatially varying label smoothing with a Gaussian kernel to generate the segmentation boundary uncertainty from the annotation. We also investigate the robustness of our method by applying image perturbations and corruptions of various types and severity levels. Results: The proposed technique is validated on two medical datasets, breast ultrasound image segmentation and robot-assisted surgical scene segmentation, where it achieves significantly better segmentation performance and robustness. Conclusion: PCD improves performance and obtains better generalization and robustness under dataset shift. Although curriculum learning requires extensive tuning of the pacing function's hyper-parameters, the level of performance improvement outweighs this limitation.
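The ingredients named in the methodology (boundary uncertainty from Gaussian label smoothing, prediction uncertainty from the teacher, and a pacing function) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not say how PCD fuses the two uncertainties or which pacing function it uses, so the weighted sum with `alpha`, the entropy-based prediction uncertainty, the `1 - |2s - 1|` boundary score, the linear pacing schedule, and all function names below are assumptions made for illustration only.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_mask(mask, sigma=1.0, radius=2):
    """Separable Gaussian blur of a binary mask (list of rows), clamped at borders."""
    kernel = gaussian_kernel_1d(sigma, radius)
    h, w = len(mask), len(mask[0])
    offsets = range(-radius, radius + 1)

    def clamp(i, n):
        return max(0, min(n - 1, i))

    # Horizontal pass, then vertical pass.
    rows = [[sum(kernel[k + radius] * mask[y][clamp(x + k, w)] for k in offsets)
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[k + radius] * rows[clamp(y + k, h)][x] for k in offsets)
             for x in range(w)] for y in range(h)]

def binary_entropy(p, eps=1e-7):
    """Entropy in bits of a foreground probability; 1.0 at p = 0.5."""
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def sample_difficulty(teacher_prob, mask, alpha=0.5):
    """Mean fused uncertainty of one image (assumed fusion: weighted sum)."""
    soft = smooth_mask(mask)
    total, count = 0.0, 0
    for prob_row, soft_row in zip(teacher_prob, soft):
        for p, s in zip(prob_row, soft_row):
            prediction_unc = binary_entropy(p)
            # Assumed score: 0 far from the boundary, 1 where the soft label is 0.5.
            boundary_unc = 1.0 - abs(2.0 * s - 1.0)
            total += alpha * prediction_unc + (1.0 - alpha) * boundary_unc
            count += 1
    return total / count

def linear_pacing(epoch, total_epochs, start=0.3):
    """Assumed pacing function: fraction of samples admitted at a given epoch."""
    return min(1.0, start + (1.0 - start) * epoch / max(1, total_epochs - 1))

def select_curriculum(difficulties, epoch, total_epochs):
    """Indices of the easiest samples admitted by the pacing function."""
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])
    keep = max(1, math.ceil(linear_pacing(epoch, total_epochs) * len(order)))
    return order[:keep]
```

In this sketch, a sample whose teacher prediction is confident and whose mask contains no boundary scores close to zero, so it is admitted early; samples with ambiguous predictions or long boundaries enter the training set as the pacing fraction grows toward 1.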