High resolution is crucial for precise segmentation of fundus images, yet processing high-resolution inputs incurs considerable GPU memory cost, with performance gains diminishing as the overhead grows. To address this issue while tackling the challenge of segmenting tiny lesions, recent studies have explored local-global fusion methods, which preserve fine details from local patches and capture long-range contextual information from downscaled global images. However, the multiple forward passes these methods require inevitably incur substantial computational overhead, degrading inference speed. In this paper, we propose HRDecoder, a simple High-Resolution Decoder network for fundus lesion segmentation. It integrates a high-resolution representation learning module that captures fine-grained local features and a high-resolution fusion module that fuses multi-scale predictions. Our method effectively improves the overall segmentation accuracy of fundus lesions while incurring reasonable memory and computational overhead and maintaining satisfactory inference speed. Experimental results on the IDRiD and DDR datasets demonstrate the effectiveness of our method. Code is available at https://github.com/CVIU-CSU/HRDecoder.