In remote sensing image analysis, patch-based methods are limited in their ability to capture context beyond the sliding window. This limitation makes complex and variable geo-objects difficult to process and leads to semantically inconsistent segmentation results. To address this challenge, we propose a dynamic scale perception framework, named GeoAgent, which adaptively captures context at an appropriate scale outside the image patch for different geo-objects. In GeoAgent, the state of each image patch is represented by a global thumbnail and a location mask. The global thumbnail provides context beyond the patch, and the location mask guides the perception of spatial relationships. Scale-selection actions are performed by a Scale Control Agent (SCA). A feature indexing module is proposed to enhance the agent's ability to distinguish the location of the current image patch. Each action switches the patch scale and the context branch of a dual-branch segmentation network that extracts and fuses the features of multi-scale patches. GeoAgent adjusts the network parameters based on the reward received for the selected scale, so as to perform the appropriate scale-selection action. Experimental results on two publicly available datasets and our newly constructed dataset, WUSU, demonstrate that GeoAgent outperforms previous segmentation methods, particularly for large-scale mapping applications.
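As an illustration of the patch state described above (a global thumbnail plus a location mask), the following is a minimal sketch, not the authors' implementation. The function name `build_patch_state`, the thumbnail size of 64, and the nearest-neighbour downsampling are all assumptions made for the example; the abstract does not specify them.

```python
import numpy as np


def build_patch_state(image, top, left, patch_size, thumb_size=64):
    """Return (thumbnail, location_mask) for one square patch.

    image      : H x W x C array holding the full scene.
    top, left  : pixel coordinates of the patch's upper-left corner.
    patch_size : side length of the patch in pixels.
    thumb_size : side length of the thumbnail (hypothetical default).
    """
    h, w = image.shape[:2]

    # Global thumbnail: nearest-neighbour downsampling by index selection.
    # (Assumption: the resampling method is not stated in the abstract.)
    rows = np.arange(thumb_size) * h // thumb_size
    cols = np.arange(thumb_size) * w // thumb_size
    thumbnail = image[rows][:, cols]

    # Location mask: 1 where the patch falls inside the thumbnail, 0 elsewhere.
    r0 = top * thumb_size // h
    c0 = left * thumb_size // w
    # Ceiling division for the far edges, keeping at least one cell marked.
    r1 = min(thumb_size, max(r0 + 1, ((top + patch_size) * thumb_size + h - 1) // h))
    c1 = min(thumb_size, max(c0 + 1, ((left + patch_size) * thumb_size + w - 1) // w))
    mask = np.zeros((thumb_size, thumb_size), dtype=np.uint8)
    mask[r0:r1, c0:c1] = 1

    return thumbnail, mask
```

In this sketch the two arrays share the same spatial resolution, so they could be stacked channel-wise before being passed to an agent network; whether GeoAgent combines them this way is not stated in the abstract.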