In application scenarios that rely on semantic segmentation, such as autonomous driving, the primary concern is real-time performance rather than extremely high segmentation accuracy. To achieve a good trade-off between speed and accuracy, two-branch architectures have been proposed in recent years. They process spatial information and semantic information separately, which allows the model to be composed of two lightweight networks. However, fusing features at two different scales has become a performance bottleneck for many current two-branch models. In this work, we design a new attention-guided fusion mechanism for two-branch architectures. Specifically, our proposed Dual-Guided Attention (DGA) module replaces some multi-scale transformations with attention computation, so that only a few attention layers of near-linear complexity are needed to achieve performance comparable to commonly used multi-layer fusion. To ensure that the module is effective, we build one of the two branches from Residual U-blocks (RSU), with the aim of obtaining better multi-scale features. Extensive experiments on the Cityscapes and CamVid datasets demonstrate the effectiveness of our method.
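To make the idea of fusing two scales with near-linear attention concrete, the following is a minimal NumPy sketch. It is not the DGA module, whose internals the abstract does not specify. It swaps in a generic kernelized linear attention with the elu(x) + 1 feature map, and assumes both branches have the same number of channels. The names `feature_map`, `linear_cross_attention`, and `fuse` are hypothetical and chosen only for illustration.

```python
import numpy as np


def feature_map(x):
    """Positive kernel feature map elu(x) + 1 (an assumed, commonly used choice)."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_cross_attention(queries, keys, values, eps=1e-6):
    """Cross-attention costing O((N + M) * d * d_v) instead of O(N * M * d).

    queries: (N, d)   tokens from the high-resolution (spatial) branch
    keys:    (M, d)   tokens from the low-resolution (semantic) branch
    values:  (M, d_v) tokens from the low-resolution (semantic) branch
    """
    q = feature_map(queries)
    k = feature_map(keys)
    kv = k.T @ values              # (d, d_v): summarise the semantic branch once
    k_sum = k.sum(axis=0)          # (d,): normaliser shared by every query
    numerator = q @ kv             # (N, d_v)
    denominator = q @ k_sum + eps  # (N,)
    return numerator / denominator[:, None]


def fuse(spatial, semantic):
    """Fuse two feature maps of different resolution without resampling either.

    spatial:  (H, W, C) high-resolution feature map
    semantic: (h, w, C) low-resolution feature map
    """
    H, W, C = spatial.shape
    q = spatial.reshape(H * W, C)
    kv = semantic.reshape(-1, C)
    attended = linear_cross_attention(q, kv, kv)
    return spatial + attended.reshape(H, W, C)  # residual connection


# Example: a 64x128 spatial map attends to a 16x32 semantic map.
rng = np.random.default_rng(0)
fused = fuse(rng.standard_normal((64, 128, 32)),
             rng.standard_normal((16, 32, 32)))
print(fused.shape)
```

Because the key-value summary `kv` is computed once and reused by every query, the cost grows with the sum of the two token counts rather than their product, which is the sense in which such a layer is near-linear. Each spatial position still receives a weighted average of all semantic tokens, so no explicit upsampling step is needed in this sketch.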