Obtaining high-quality, readily accessible annotated samples is a notable challenge in medical image analysis. Transfer learning techniques, which leverage pre-training data, offer a flexible solution to this issue. However, the benefit of fine-tuning diminishes when the dataset exhibits an irregular distribution between classes. This paper introduces a novel deep convolutional neural network, named Curriculum Learning and Progressive Self-supervised Training (CURVETE). CURVETE addresses challenges related to limited samples, enhances model generalisability, and improves overall classification performance. It achieves this by employing a curriculum learning strategy based on the granularity of sample decomposition during training on generic unlabelled samples. Moreover, CURVETE addresses the challenge of irregular class distribution by incorporating a class decomposition approach in the downstream task. The proposed method is evaluated on three distinct medical image datasets: a brain tumour dataset, a digital knee x-ray dataset, and the Mini-DDSM dataset. We investigate the classification performance of a generic self-supervised sample decomposition approach with and without the curriculum learning component in training the pretext task. Experimental results demonstrate that, with a ResNet-50 baseline, the CURVETE model achieves superior performance on the test sets, with an accuracy of 96.60% on the brain tumour dataset, 75.60% on the digital knee x-ray dataset, and 93.35% on the Mini-DDSM dataset. Furthermore, with a DenseNet-121 baseline, it achieves accuracies of 95.77%, 80.36%, and 93.22% on the brain tumour, digital knee x-ray, and Mini-DDSM datasets, respectively, outperforming other training strategies.
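To make the class decomposition idea concrete, the following is a minimal sketch, not the CURVETE implementation. It assumes that each class is split into k sub-classes by clustering its feature vectors, and that a predicted sub-class is mapped back to its parent class at evaluation time. The choice of k-means and the names `kmeans`, `decompose_classes`, and `recompose` are illustrative assumptions; the abstract does not specify the clustering algorithm or the value of k.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means for illustration; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, p in enumerate(points):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def decompose_classes(features, labels, k=2):
    """Split every class into k sub-classes (assumed scheme).

    Labels are assumed to be integers 0..C-1; the sub-class label is
    encoded as class * k + cluster so it can be inverted later.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    sub_labels = [None] * len(labels)
    for y, idxs in by_class.items():
        clusters = kmeans([features[i] for i in idxs], k)
        for i, c in zip(idxs, clusters):
            sub_labels[i] = y * k + c
    return sub_labels


def recompose(sub_label, k=2):
    """Map a predicted sub-class back to its parent class."""
    return sub_label // k


# Toy 2-D "features": two classes, each with two visibly separate groups.
features = [
    [0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0],
    [10.0, 0.0], [10.1, 0.0], [20.0, 9.0], [20.1, 9.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
sub_labels = decompose_classes(features, labels, k=2)
parents = [recompose(s, k=2) for s in sub_labels]
```

In this sketch a downstream classifier would be trained on `sub_labels` rather than `labels`, and its predictions would be passed through `recompose` before accuracy is computed against the original classes.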