Semantic segmentation of remote sensing images is essential for applications such as vegetation monitoring, disaster management, and urban planning. Previous studies have shown that the self-attention (SA) mechanism is an effective basis for segmentation networks because it captures long-range pixel dependencies: SA lets the network model global dependencies among input features, which improves segmentation results. However, the dense attention maps that SA computes make its computational cost grow quadratically with the number of pixels, and they introduce redundant information that degrades the feature representation. Inspired by traditional threshold segmentation algorithms, we propose a novel threshold attention mechanism (TAM). TAM significantly reduces the computational cost while better modeling the correlation between different regions of the feature map. Based on TAM, we present a threshold attention network (TANet) for semantic segmentation. TANet consists of an attentional feature enhancement module (AFEM), which globally enhances shallow features, and a threshold attention pyramid pooling module (TAPP), which extracts multi-scale information from deep features. We conducted extensive experiments on the ISPRS Vaihingen and Potsdam datasets. The results demonstrate the effectiveness of TANet and its superiority over state-of-the-art models.