Accurate segmentation of blood vessels in retinal fundus images is crucial for the early screening, diagnosis, and evaluation of some ocular diseases, yet the task remains challenging because of factors such as significant illumination variation, uneven curvilinear structures, and non-uniform contrast. To address this, a multiple attention-guided fusion network (MAF-Net) is proposed to accurately detect blood vessels in retinal fundus images. Traditional UNet-based models may lose part of the information because they do not explicitly model long-distance dependencies, which can lead to unsatisfactory results. To enrich contextual information and compensate for the loss of scene information, an attention fusion mechanism that combines channel attention with a Transformer-based spatial attention mechanism is employed to extract various features of blood vessels from retinal fundus images. Subsequently, a unique spatial attention mechanism is applied in the skip connections to filter redundant information and noise out of the low-level features, enabling better integration with the high-level features. In addition, a DropOut layer randomly discards some neurons, which helps prevent overfitting of the network and improves its generalization performance. The method was evaluated on the public datasets DRIVE, STARE, and CHASEDB1, achieving F1 scores of 0.818, 0.836, and 0.811 and accuracy (Acc) values of 0.968, 0.973, and 0.973, respectively. Both visual inspection and quantitative evaluation demonstrate that our method produces satisfactory results compared with some state-of-the-art methods.
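The two reported metrics, F1 and Acc, can be illustrated with a minimal sketch that assumes both are computed pixel-wise on binary vessel masks (1 = vessel, 0 = background). The function name `vessel_metrics` and the toy masks are hypothetical and are not taken from the authors' evaluation code.

```python
from typing import Dict, Sequence


def vessel_metrics(
    pred: Sequence[Sequence[int]],
    truth: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Pixel-wise F1 and accuracy for two equally shaped binary masks.

    Assumption: 1 marks a vessel pixel and 0 marks background.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # vessel correctly detected
            elif p:
                fp += 1      # background marked as vessel
            elif t:
                fn += 1      # vessel pixel missed
            else:
                tn += 1      # background correctly rejected

    total = tp + fp + fn + tn
    f1_denominator = 2 * tp + fp + fn
    # Guard against empty masks or masks without any vessel pixels.
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0
    acc = (tp + tn) / total if total else 0.0
    return {"f1": f1, "acc": acc}


# Hypothetical 3x4 masks, for illustration only.
truth = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
pred = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

scores = vessel_metrics(pred, truth)
print(f"F1 = {scores['f1']:.3f}, Acc = {scores['acc']:.3f}")
```

In this toy case one vessel pixel is missed and one background pixel is wrongly marked, so F1 is 0.75 while Acc is about 0.833. Because background pixels typically far outnumber vessel pixels in fundus images, accuracy tends to sit well above F1, which is consistent with the gap between the Acc and F1 values reported above.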