Recent real-time semantic segmentation models, whether single-branch or multi-branch, achieve a good balance of accuracy and speed. However, their inference speed is limited by multi-path blocks, and some models rely on high-performance teacher models for training. To overcome these issues, we propose the Golden Cudgel Network (GCNet). Specifically, GCNet uses vertical multi-convolutions and horizontal multi-paths during training, which are reparameterized into a single convolution for inference, optimizing both accuracy and speed. This design allows GCNet to self-enlarge during training and self-contract during inference, effectively acting as its own "teacher model" without requiring an external one. Experimental results show that GCNet outperforms existing state-of-the-art models in both accuracy and speed on the Cityscapes, CamVid, and Pascal VOC 2012 datasets. The code is available at https://github.com/gyyang23/GCNet.
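The horizontal-path half of this reparameterization rests on the linearity of convolution: the sum of two parallel convolutions over the same input equals a single convolution whose kernel is the sum of the branch kernels. The sketch below illustrates this identity with a minimal single-channel NumPy convolution; it is an illustration of the general principle only, not GCNet's actual implementation (which also folds batch normalization and merges vertically stacked convolutions), and all names are ours.

```python
import numpy as np

def conv2d(x, w):
    """Valid-mode single-channel cross-correlation (a toy stand-in for a conv layer)."""
    h, wd = x.shape
    k = w.shape[0]
    out = np.zeros((h - k + 1, wd - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))     # input feature map
w1 = rng.standard_normal((3, 3))    # training-time branch 1 kernel
w2 = rng.standard_normal((3, 3))    # training-time branch 2 kernel

# Training-time multi-path: two parallel convolutions, outputs summed.
multi_branch = conv2d(x, w1) + conv2d(x, w2)

# Inference-time single path: one convolution with the merged kernel.
merged = conv2d(x, w1 + w2)

print(np.allclose(multi_branch, merged))  # True by linearity of convolution
```

Because the merge is exact, the contracted network computes the same function as the enlarged training-time network while paying the cost of only one convolution per block at inference.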