Polyp segmentation is crucial for preventing colorectal cancer, a common type of cancer. Deep learning has been used to segment polyps automatically, which reduces the risk of misdiagnosis. Localizing small polyps in colonoscopy images is challenging because of their complex characteristics, such as color, occlusion, and varied shapes. To address this challenge, we propose a novel frequency-based fully convolutional neural network, the Multi-Frequency Feature Fusion Polyp Segmentation Network (M3FPolypSegNet), which decomposes the input image into low-, high-, and full-frequency components to exploit the characteristics of each component. Three independent multi-frequency encoders map the resulting input images into a high-dimensional feature space. In the Frequency-ASPP Scalable Attention Module (F-ASPP SAM), atrous spatial pyramid pooling (ASPP) is applied between the frequency components to preserve scale information. Scalable attention is then applied to emphasize polyp regions in the high-dimensional feature space. Finally, we design three learning tasks (region, edge, and distance) that are trained jointly in four decoder blocks to learn the structural characteristics of the polyp region. The proposed model outperformed various segmentation models, with average performance gains across all metrics of 6.92% on CVC-ClinicDB and 7.52% on BKAI-IGH-NeoPolyp.
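The decomposition into low-, high-, and full-frequency components can be illustrated with a minimal sketch. The abstract does not state how the components are obtained, so this example assumes an ideal circular low-pass mask in the Fourier domain; the function name `decompose_frequencies` and the cutoff `radius` are hypothetical and not taken from the paper.

```python
import numpy as np


def decompose_frequencies(image, radius=16):
    """Split a 2-D grayscale image into low-, high-, and full-frequency parts.

    Assumption: an ideal circular low-pass mask on the centered FFT spectrum.
    `radius` is a hypothetical cutoff in frequency bins, not a paper value.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape

    # Centered spectrum: the zero-frequency term sits at (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Distance of every frequency bin from the spectrum center.
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low_mask = dist <= radius

    # Complementary masks, so low + high reconstructs the input.
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real

    # The full-frequency component is simply the unfiltered image.
    return low, high, image
```

Under these assumptions, each of the three returned arrays would be fed to its own encoder. Because the two masks are complementary, the low and high components sum back to the original image, which makes the split easy to sanity-check.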