Self-training with dual uncertainty for semi-supervised medical image segmentation

In semi-supervised medical image segmentation, the shortage of labeled data is the fundamental problem, and the main research question is how to learn useful image features from unlabeled images to improve segmentation accuracy. Traditional self-training methods partially address the lack of labels by generating pseudo labels and retraining on them iteratively. However, the noise that model uncertainty introduces into those pseudo labels during training directly degrades the segmentation results. We therefore add sample-level and pixel-level uncertainty to the self-training framework to stabilize training. Specifically, we save several snapshots of the model during pre-training and use the disagreement between their predictions on an unlabeled sample as that sample's sample-level uncertainty estimate. We then add unlabeled samples to training gradually, from easy to hard. At the same time, we add a second decoder with a different upsampling method to the segmentation network and use the difference between the outputs of the two decoders as the pixel-level uncertainty. In short, we selectively retrain on unlabeled samples and assign pixel-level uncertainty to the pseudo labels to optimize the self-training process. We compare our model with five semi-supervised approaches on the public 2017 ACDC dataset and the 2018 Prostate dataset. Under the same settings, the proposed method achieves better segmentation performance on both datasets, demonstrating its effectiveness and robustness and suggesting that it may transfer to other medical image segmentation tasks.

Keywords: Medical image segmentation, semi-supervised learning, self-training, uncertainty estimation
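The two uncertainty signals described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the disagreement measures, so the sketch assumes binary masks, mean pairwise (1 - Dice) between snapshot predictions as the sample-level uncertainty, one minus the absolute difference between the two decoders' foreground probabilities as a per-pixel weight, and "easy" meaning low sample-level uncertainty. All function names (`sample_uncertainty`, `easy_to_hard`, `pixel_weights`, `weighted_bce`) are hypothetical.

```python
import numpy as np


def dice(a, b, eps=1e-6):
    """Dice overlap between two binary masks (1.0 means identical)."""
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)


def sample_uncertainty(snapshot_masks):
    """Assumed sample-level measure: mean pairwise (1 - Dice) between
    the masks predicted by the saved model snapshots for one image."""
    n = len(snapshot_masks)
    scores = [
        1.0 - dice(snapshot_masks[i], snapshot_masks[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return float(np.mean(scores))


def easy_to_hard(uncertainties, fraction):
    """Indices of the `fraction` least-uncertain (assumed easiest) samples."""
    k = max(1, int(round(fraction * len(uncertainties))))
    return np.argsort(uncertainties)[:k]


def pixel_weights(prob_a, prob_b):
    """Assumed pixel-level weight: 1 where the two decoders agree,
    smaller where their foreground probabilities differ."""
    return 1.0 - np.abs(prob_a - prob_b)


def weighted_bce(prob, pseudo_label, weights, eps=1e-7):
    """Binary cross-entropy on pseudo labels, down-weighted per pixel."""
    p = np.clip(prob, eps, 1.0 - eps)
    loss = -(pseudo_label * np.log(p) + (1 - pseudo_label) * np.log(1 - p))
    return float((weights * loss).sum() / (weights.sum() + eps))
```

In a self-training loop under these assumptions, `fraction` would grow from round to round so that harder samples enter later, and `pixel_weights` would be recomputed from the two decoders' outputs each time pseudo labels are refreshed.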
