Remote sensing images are usually characterized by complex backgrounds, scale and orientation variations, and large intra-class variance. General semantic segmentation methods usually fail to fully address these issues, and their performance on remote sensing image segmentation is therefore limited. In this paper, we propose LOGCAN++, a semantic segmentation model customized for remote sensing images, which consists of a Global Class Awareness (GCA) module and several Local Class Awareness (LCA) modules. The GCA module captures global representations for class-level context modeling, reducing the interference of background noise. The LCA module generates local class representations as intermediate perceptual elements that indirectly associate pixels with the global class representations, addressing the large intra-class variance problem. In particular, we introduce affine transformations in the LCA module to adaptively extract local class representations, making the model tolerant to scale and orientation variations in remote sensing images. Extensive experiments on three benchmark datasets show that LOGCAN++ outperforms current mainstream general-purpose and remote-sensing-specific semantic segmentation methods and achieves a better trade-off between speed and accuracy. Code is available at https://github.com/xwmaxwma/rssegmentation.
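The class-level context modeling mentioned above can be illustrated with a minimal sketch: pool pixel features into per-class representations using coarse class probabilities, then let each pixel attend over those class representations. This is a generic class-attention scheme written for illustration only, not the authors' LOGCAN++ implementation; the function name and shapes are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def class_center_attention(feats, logits):
    """Illustrative class-level context modeling (not the LOGCAN++ code).
    feats:  (N, C) flattened pixel features
    logits: (N, K) coarse per-pixel class scores
    Returns (N, C) class-aware context for each pixel."""
    probs = softmax(logits, axis=1)                                  # (N, K)
    # class representations: probability-weighted average of pixel features
    centers = probs.T @ feats / (probs.sum(axis=0)[:, None] + 1e-6)  # (K, C)
    # pixel-to-class affinity, then aggregate class representations
    attn = softmax(feats @ centers.T, axis=1)                        # (N, K)
    return attn @ centers                                            # (N, C)
```

In LOGCAN++ the analogous association is done locally (within windows, with affine-transformed sampling) before relating pixels to the global class representations; the sketch above only shows the basic pixel-to-class attention pattern.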