Remote sensing images are known for complex backgrounds, high intra-class variance, and large variation in object scale, all of which pose challenges for semantic segmentation. We present LoG-CAN, a multi-scale semantic segmentation network for remote sensing images that combines a global class-aware (GCA) module with local class-aware (LCA) modules. Specifically, the GCA module captures global class representations for class-wise context modeling, which circumvents background interference; the LCA modules generate local class representations that serve as intermediate aware elements, indirectly associating pixels with the global class representations to reduce intra-class variance; and a multi-scale architecture built from GCA and LCA modules segments objects at different scales effectively through cascaded refinement and fusion of features. Experimental results on the ISPRS Vaihingen and ISPRS Potsdam datasets indicate that LoG-CAN outperforms state-of-the-art methods for general semantic segmentation while significantly reducing network parameters and computation. Code is available at~\href{https://github.com/xwmaxwma/rssegmentation}{https://github.com/xwmaxwma/rssegmentation}.
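The class-aware idea described above can be illustrated with a minimal NumPy sketch. It is not the released implementation: the function names, the patch size, the softmax weighting, and the scaled dot-product affinity are all assumptions made for illustration. Class representations are taken as prediction-weighted averages of pixel features, computed once over the whole image (global) and once per patch (local); each pixel is then matched against the local class representations, and the resulting affinity is used to aggregate the global ones.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def class_representations(features, logits):
    """Hypothetical helper: one feature vector per class.

    features: (N, C) pixel features; logits: (N, K) coarse class scores.
    Returns (K, C), each row a weighted mean of pixel features, with
    weights given by a softmax over the N pixels for that class.
    """
    weights = softmax(logits, axis=0)  # (N, K), each column sums to 1
    return weights.T @ features        # (K, C)


def class_aware_context(features, logits, patch=4):
    """Assumed sketch of local-to-global class-aware context.

    features: (H, W, C), logits: (H, W, K); H and W divisible by patch.
    Returns a (H, W, C) context map.
    """
    H, W, C = features.shape
    K = logits.shape[-1]
    # Global class representations from the whole image.
    global_repr = class_representations(
        features.reshape(-1, C), logits.reshape(-1, K)
    )
    out = np.empty_like(features)
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            f = features[i:i + patch, j:j + patch].reshape(-1, C)
            s = logits[i:i + patch, j:j + patch].reshape(-1, K)
            # Local class representations from this patch only.
            local_repr = class_representations(f, s)
            # Pixel-to-local-class affinity (assumed scaled dot product).
            affinity = softmax(f @ local_repr.T / np.sqrt(C), axis=1)
            # Pixels reach the global representations only via the affinity.
            context = affinity @ global_repr
            out[i:i + patch, j:j + patch] = context.reshape(patch, patch, C)
    return out
```

For a feature map of shape `(8, 8, 16)` and coarse logits of shape `(8, 8, 5)`, `class_aware_context(features, logits, patch=4)` returns an `(8, 8, 16)` context map in which every pixel vector is a convex combination of the five global class representations.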