This work introduces EffiSegNet, a novel segmentation framework that leverages transfer learning by using a pre-trained Convolutional Neural Network (CNN) classifier as its backbone. Deviating from traditional architectures with a symmetric U-shape, EffiSegNet simplifies the decoder and uses full-scale feature fusion to minimize computational cost and the number of parameters. We evaluated our model on the gastrointestinal polyp segmentation task using the publicly available Kvasir-SEG dataset, achieving state-of-the-art results. Specifically, with a pre-trained backbone, the EffiSegNet-B4 network variant achieved an F1 score of 0.9552, mean Dice (mDice) of 0.9483, mean Intersection over Union (mIoU) of 0.9056, Precision of 0.9679, and Recall of 0.9429, which are, to the best of our knowledge, the highest scores reported in the literature for this dataset. Training from scratch also performed strongly relative to previous work, achieving an F1 score of 0.9286, mDice of 0.9207, mIoU of 0.8668, Precision of 0.9311, and Recall of 0.9262. These results underscore the importance of a well-designed encoder in image segmentation networks and the effectiveness of transfer learning approaches.
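
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Dice, IoU, Precision, and Recall can be computed for a pair of binary masks and then averaged over a dataset. It is not the authors' evaluation code: the function names, the plain nested-list mask representation, the smoothing constant `eps`, and the per-image averaging scheme are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Sequence, Tuple

# Assumed representation: a 2D binary mask, 1 = polyp pixel, 0 = background.
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives pixel by pixel."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred: Mask, target: Mask, eps: float = 1e-7) -> Dict[str, float]:
    """Per-image metrics; eps (an assumption) avoids division by zero on empty masks."""
    tp, fp, fn = confusion_counts(pred, target)
    return {
        "precision": tp / (tp + fp + eps),
        "recall": tp / (tp + fn + eps),
        "dice": 2 * tp / (2 * tp + fp + fn + eps),  # equals pixel-wise F1 on one mask
        "iou": tp / (tp + fp + fn + eps),
    }


def mean_metrics(pairs: Iterable[Tuple[Mask, Mask]]) -> Dict[str, float]:
    """Average the per-image metrics over (prediction, ground truth) pairs."""
    per_image = [segmentation_metrics(pred, target) for pred, target in pairs]
    if not per_image:
        raise ValueError("mean_metrics needs at least one (pred, target) pair")
    return {key: sum(m[key] for m in per_image) / len(per_image) for key in per_image[0]}


# Toy example: 2 true positives, 1 false positive, 1 false negative.
example_pred = [[1, 1, 0], [0, 1, 0]]
example_target = [[1, 0, 0], [0, 1, 1]]
example_scores = segmentation_metrics(example_pred, example_target)
```

Averaging the per-image `dice` and `iou` values in this way gives quantities analogous to mDice and mIoU. How the paper aggregates its F1, Precision, and Recall scores is not stated in the abstract, so the averaging used here should be read as one plausible choice.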