Augmentation is AUtO-Net: Augmentation-Driven Contrastive Multiview Learning for Medical Image Segmentation

Deep learning segmentation algorithms learn complex organ and tissue patterns and extract essential regions of interest from noisy backgrounds, which improves the visual basis for medical image diagnosis, and they have achieved impressive results in Medical Image Computing (MIC). This thesis focuses on retinal blood vessel segmentation. It provides an extensive literature review of deep learning-based medical image segmentation approaches and compares their methodologies and empirical performance. It also identifies two significant limitations of current state-of-the-art methods: constraints on data size and dependency on high computational resources. To address these problems, this work proposes a novel, simple and efficient multiview learning framework that contrastively learns invariant vessel feature representations by comparing multiple augmented views generated by various transformations, in order to overcome data shortage and improve generalisation ability. Moreover, the hybrid network architecture integrates an attention mechanism into a Convolutional Neural Network to better capture complex, continuous, curvilinear vessel structures. Validated on the CHASE-DB1 dataset, the proposed method attains the highest F1 score of 83.46% and the highest Intersection over Union (IoU) score of 71.62% with a UNet structure, surpassing existing benchmark UNet-based methods by 1.95% and 2.8%, respectively. Taken together, these metrics indicate that the model detects vessels accurately and that its predictions overlap closely with the ground truth. Moreover, the proposed approach can be trained within 30 minutes while consuming less than 3 GB of GPU RAM, characteristics that support efficient implementation in real-world applications and deployments.
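To make the two reported metrics concrete, the following is a minimal sketch of how F1 (equivalent to the Dice coefficient for binary masks) and IoU can be computed from a predicted vessel mask and a ground-truth mask. It is an illustration only, not the evaluation code used in this work: the function name `f1_and_iou`, the toy masks, and the convention of returning 1.0 when both masks are empty are all assumptions. When both metrics are computed from the same confusion matrix they are related by IoU = F1 / (2 - F1), and the reported values are consistent with that identity (0.8346 / 1.1654 ≈ 0.7162).

```python
import numpy as np


def f1_and_iou(pred, target):
    """Return (F1, IoU) for two binary masks of the same shape.

    Hypothetical helper for illustration; vessel pixels are the positive class.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = np.logical_and(pred, target).sum()    # vessel predicted and present
    fp = np.logical_and(pred, ~target).sum()   # vessel predicted, not present
    fn = np.logical_and(~pred, target).sum()   # vessel missed

    f1_denominator = 2 * tp + fp + fn
    iou_denominator = tp + fp + fn

    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_denominator if f1_denominator else 1.0
    iou = tp / iou_denominator if iou_denominator else 1.0
    return float(f1), float(iou)


# Toy 2x3 masks (made-up values, not from CHASE-DB1).
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

f1, iou = f1_and_iou(pred, target)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")
```

In the toy example there are two true positives, one false positive and one false negative, so F1 is 4/6 and IoU is 2/4. Because IoU penalises each error more heavily than F1, it is always the lower of the two whenever the prediction is imperfect.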
