Convolutional neural networks (CNNs) are a promising technique for automated glaucoma diagnosis from fundus images, which are routinely acquired as part of an ophthalmic exam. Nevertheless, CNNs typically require a large amount of well-labeled training data, which may not be available in many biomedical image classification applications, especially when diseases are rare and expert labeling is costly. This article makes two contributions to address this issue: (1) it extends the conventional Siamese network and introduces a training method for low-shot learning when labeled data are limited and imbalanced, and (2) it introduces a novel semi-supervised learning strategy that uses additional unlabeled training data to achieve greater accuracy. Our proposed multi-task Siamese network (MTSN) can employ any backbone CNN, and we demonstrate with four backbone CNNs that its accuracy with limited training data approaches the accuracy of backbone CNNs trained with a dataset that is 50 times larger. We also introduce One-Vote Veto (OVV) self-training, a semi-supervised learning strategy designed specifically for MTSNs. By taking both self-predictions and contrastive predictions of the unlabeled training data into account, OVV self-training provides additional pseudo labels for fine-tuning a pre-trained MTSN. Extensive experiments on a large, imbalanced dataset of 66,715 fundus photographs acquired over 15 years demonstrate the effectiveness of low-shot learning with MTSN and semi-supervised learning with OVV self-training. Three additional, smaller clinical datasets of fundus images acquired under different conditions (cameras, instruments, locations, populations) are used to demonstrate the generalizability of the proposed methods.
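The abstract does not spell out the OVV decision rule, so the following is only a minimal sketch of one plausible reading of "one-vote veto": an unlabeled image receives a pseudo label only when its self-prediction and every contrastive prediction against labeled reference images confidently agree, and a single dissenting or uncertain vote rejects it. The function name `ovv_pseudo_label`, its parameters, the binary 0/1 label encoding, and the `threshold` value are all hypothetical and are not taken from the article.

```python
from typing import Optional, Sequence


def _confident_vote(prob_positive: float, threshold: float) -> Optional[int]:
    """Turn a probability into a hard 0/1 vote, or None if it is not confident."""
    if prob_positive >= threshold:
        return 1
    if prob_positive <= 1.0 - threshold:
        return 0
    return None


def ovv_pseudo_label(
    self_prob: float,
    same_class_probs: Sequence[float],
    reference_labels: Sequence[int],
    threshold: float = 0.9,
) -> Optional[int]:
    """Hypothetical one-vote-veto rule (an assumption, not the authors' method).

    self_prob        -- P(label == 1) from the classification output.
    same_class_probs -- for each labeled reference image, P(unlabeled image
                        has the same class as that reference) from the
                        Siamese (contrastive) output.
    reference_labels -- the known 0/1 labels of those reference images.
    Returns 0 or 1 when every vote confidently agrees, otherwise None.
    """
    if len(same_class_probs) != len(reference_labels):
        raise ValueError("one similarity score is needed per reference label")

    self_vote = _confident_vote(self_prob, threshold)
    if self_vote is None:
        return None  # the self-prediction itself is not confident

    for same_prob, ref_label in zip(same_class_probs, reference_labels):
        same = _confident_vote(same_prob, threshold)
        if same is None:
            return None  # an uncertain contrastive vote also counts as a veto
        # "same class" implies the reference label; "different" implies the other one.
        contrastive_vote = ref_label if same == 1 else 1 - ref_label
        if contrastive_vote != self_vote:
            return None  # a single disagreeing vote vetoes the pseudo label

    return self_vote
```

Under these assumptions, images that return `None` would simply stay unlabeled, while the rest would be added to the pool used to fine-tune the pre-trained MTSN. A stricter or looser `threshold` trades pseudo-label quantity against reliability.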