Self-training with dual uncertainty for semi-supervised medical image segmentation

In semi-supervised medical image segmentation, the shortage of labeled data is the fundamental problem, and the main research question is how to learn useful image features from unlabeled images to improve segmentation accuracy. Traditional self-training methods partially address the lack of labeled data by generating pseudo labels for iterative training. However, noise arising from the model's uncertainty during training directly degrades the segmentation results. We therefore add sample-level and pixel-level uncertainty to the self-training framework to stabilize training. Specifically, we save several snapshots of the model during pre-training and use the difference between their predictions on an unlabeled sample as the sample-level uncertainty estimate for that sample. We then add unlabeled samples to training gradually, from easy to hard. At the same time, we add a second decoder with a different upsampling method to the segmentation network and use the difference between the outputs of the two decoders as the pixel-level uncertainty. In short, we selectively retrain on unlabeled samples and assign pixel-level uncertainty to pseudo labels to optimize the self-training process. We compare our model with five semi-supervised approaches on the public 2017 ACDC dataset and the 2018 Prostate dataset. Under the same settings, the proposed method achieves better segmentation performance on both datasets, demonstrating its effectiveness, robustness, and potential transferability to other medical image segmentation tasks.

Keywords: Medical image segmentation, semi-supervised learning, self-training, uncertainty estimation
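
The two uncertainty signals described above can be illustrated with a minimal NumPy sketch. The abstract does not specify the exact disagreement measures or how the pixel-level uncertainty enters the loss, so everything below is an assumption made for illustration: sample-level uncertainty is taken as the mean pairwise label disagreement between saved snapshots, "easy to hard" is read as ascending sample-level uncertainty, pixel-level uncertainty is the total variation distance between the two decoders' softmax outputs, and pseudo-label pixels are down-weighted by one minus that uncertainty. All function names are hypothetical.

```python
import numpy as np


def sample_uncertainty(snapshot_probs):
    """Sample-level uncertainty for one unlabeled image (assumed measure).

    snapshot_probs: array of shape (K, C, H, W) holding the softmax outputs
    of K saved model snapshots. Returns the mean fraction of pixels on which
    two snapshots disagree, averaged over all snapshot pairs.
    """
    labels = snapshot_probs.argmax(axis=1)  # (K, H, W) hard predictions
    k = labels.shape[0]
    disagreements = [
        np.mean(labels[i] != labels[j])
        for i in range(k)
        for j in range(i + 1, k)
    ]
    return float(np.mean(disagreements))


def easy_to_hard_order(uncertainties):
    """Indices of unlabeled samples sorted from lowest to highest uncertainty."""
    return np.argsort(uncertainties, kind="stable")


def pixel_uncertainty(probs_a, probs_b):
    """Pixel-level uncertainty from two decoders (assumed measure).

    probs_a, probs_b: softmax outputs of shape (C, H, W). Returns an (H, W)
    map of total variation distances, each in [0, 1].
    """
    return 0.5 * np.abs(probs_a - probs_b).sum(axis=0)


def weighted_pseudo_label_loss(student_probs, pseudo_label, uncertainty):
    """Cross-entropy against pseudo labels, down-weighted where uncertain.

    student_probs: (C, H, W) softmax output; pseudo_label: (H, W) integer
    class map; uncertainty: (H, W) map in [0, 1]. The (1 - uncertainty)
    weighting is an illustrative choice, not taken from the paper.
    """
    eps = 1e-8
    # Probability assigned to the pseudo-label class at every pixel.
    p = np.take_along_axis(student_probs, pseudo_label[None], axis=0)[0]
    cross_entropy = -np.log(p + eps)
    weights = 1.0 - uncertainty
    return float((weights * cross_entropy).sum() / (weights.sum() + eps))
```

In a self-training loop built on this sketch, one would compute `sample_uncertainty` once per unlabeled image after pre-training, use `easy_to_hard_order` to decide which images enter each retraining round, and apply `weighted_pseudo_label_loss` with the map from `pixel_uncertainty` so that pixels where the two decoders disagree contribute less to the update.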
