Remote sensing images pose distinct challenges for downstream tasks due to their inherent complexity. While a considerable amount of research has been dedicated to remote sensing classification, object detection and semantic segmentation, most of these studies have overlooked the valuable prior knowledge embedded within remote sensing scenarios. Such prior knowledge is useful because remote sensing objects may be misrecognized without reference to a sufficiently long-range context, and the range of context required can vary across object types. This paper takes these priors into account and proposes a lightweight Large Selective Kernel Network (LSKNet) backbone. LSKNet can dynamically adjust its large spatial receptive field to better model the varying long-range context of different objects in remote sensing scenarios. To our knowledge, large and selective kernel mechanisms have not previously been explored for remote sensing images. Without bells and whistles, our lightweight LSKNet sets new state-of-the-art scores on standard remote sensing classification, object detection and semantic segmentation benchmarks. Our comprehensive analysis further validates the significance of the identified priors and the effectiveness of LSKNet. The code is available at https://github.com/zcablii/LSKNet.