Semantic segmentation plays a key role in applications such as autonomous driving and medical imaging. Although existing real-time semantic segmentation models strike a commendable balance between accuracy and speed, their multi-path blocks still reduce overall speed. To address this issue, this study proposes a Reparameterizable Dual-Resolution Network (RDRNet) dedicated to real-time semantic segmentation. Specifically, RDRNet employs a two-branch architecture that uses multi-path blocks during training and reparameterizes them into single-path blocks for inference, thereby improving both accuracy and inference speed. Furthermore, we propose the Reparameterizable Pyramid Pooling Module (RPPM), which enhances the feature representation of the pyramid pooling module without increasing its inference time. Experimental results on the Cityscapes, CamVid, and Pascal VOC 2012 datasets demonstrate that RDRNet outperforms existing state-of-the-art models in terms of both performance and speed. The code is available at https://github.com/gyyang23/RDRNet.
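The reparameterization idea can be illustrated with a minimal sketch. It is not the RDRNet block itself: it assumes a RepVGG-style block with a 3x3 branch, a 1x1 branch, and an identity shortcut, uses a single channel, and omits batch normalization. The names `conv2d`, `multi_path`, and `reparameterize` are hypothetical helpers introduced only for this illustration. Because convolution is linear, the three parallel branches can be folded into one 3x3 kernel that produces the same output.

```python
import random

def conv2d(image, kernel):
    """Stride-1, zero-padded 2D cross-correlation on one channel (output size = input size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # positions outside the image count as zero
                        acc += kernel[di][dj] * image[y][x]
            out[i][j] = acc
    return out

def multi_path(image, k3, k1):
    """Assumed training-time form: 3x3 branch + 1x1 branch + identity shortcut."""
    a = conv2d(image, k3)
    b = conv2d(image, [[k1]])
    h, w = len(image), len(image[0])
    return [[a[i][j] + b[i][j] + image[i][j] for j in range(w)] for i in range(h)]

def reparameterize(k3, k1):
    """Fold the 1x1 weight and the identity (weight 1.0) into the centre tap of the 3x3 kernel."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1.0
    return merged

random.seed(0)
image = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(5)]
k3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
k1 = random.uniform(-1, 1)

train_out = multi_path(image, k3, k1)               # three branches, summed
infer_out = conv2d(image, reparameterize(k3, k1))   # one merged 3x3 convolution
max_diff = max(abs(train_out[i][j] - infer_out[i][j])
               for i in range(5) for j in range(6))
print("max abs difference:", max_diff)
```

The two outputs agree up to floating-point rounding, which is why a block of this kind can be trained with several paths and deployed as a single convolution. In a real network the same folding would be applied per channel pair, after first absorbing each branch's batch normalization into its convolution weights and bias.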