Hi-ResNet: A High-Resolution Remote Sensing Network for Semantic Segmentation

High-resolution remote sensing (HRS) semantic segmentation extracts key objects from high-resolution coverage areas. However, objects of the same category in HRS images often differ significantly in scale and shape across diverse geographical environments, making the data distribution difficult to fit. Additionally, complex backgrounds make objects of different categories look similar, so a substantial number of objects are misclassified as background. These issues make existing learning algorithms sub-optimal. In this work, we address these problems by proposing a High-resolution remote sensing network (Hi-ResNet) with an efficient network structure, which consists, in sequence, of a funnel module, a multi-branch module with stacks of information aggregation (IA) blocks, and a feature refinement module, together with a Class-agnostic Edge Aware (CEA) loss. First, the funnel module downsamples the initial input image, which reduces computational cost, and extracts high-resolution semantic information from it. Second, we incrementally downsample the processed feature maps into multi-resolution branches to capture image features at different scales, and apply IA blocks, which use attention mechanisms to capture key latent information, for effective feature aggregation that distinguishes image features of the same class with varying scales and shapes. Finally, our feature refinement module integrates the CEA loss function, which disambiguates objects of different classes with similar shapes and increases the data distribution distance for correct predictions. With effective pre-training strategies, we demonstrate the superiority of Hi-ResNet over state-of-the-art methods on three HRS segmentation benchmarks.
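
The incremental multi-resolution branching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each new branch halves the resolution of the previous one and uses plain 2x2 average pooling in place of the network's learned downsampling layers. The names `avg_pool_2x2`, `build_branches`, and `shape`, the branch count, and the toy 8x8 input are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]  # single-channel feature map as nested lists


def avg_pool_2x2(feature: Grid) -> Grid:
    """Halve height and width by averaging each non-overlapping 2x2 window.

    Assumption: average pooling stands in for a learned strided convolution.
    """
    height, width = len(feature), len(feature[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")
    return [
        [
            (feature[r][c] + feature[r][c + 1]
             + feature[r + 1][c] + feature[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def build_branches(feature: Grid, num_branches: int) -> List[Grid]:
    """Return branches at full, 1/2, 1/4, ... resolution.

    Each branch is produced from the previous one, mirroring the
    "incremental" downsampling described in the abstract.
    """
    branches = [feature]
    for _ in range(num_branches - 1):
        branches.append(avg_pool_2x2(branches[-1]))
    return branches


def shape(feature: Grid) -> Tuple[int, int]:
    """Return (height, width) of a feature map."""
    return len(feature), len(feature[0])


if __name__ == "__main__":
    # Toy 8x8 map standing in for the funnel module's output (hypothetical).
    funnel_out = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
    for index, branch in enumerate(build_branches(funnel_out, num_branches=3)):
        print(index, shape(branch))
```

In the actual network, each branch would additionally pass through IA blocks and exchange information with the other branches; the sketch covers only how the resolutions are derived from one another.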
