Existing methods for farmland remote sensing image (FRSI) segmentation generally follow a static segmentation paradigm, in which analysis relies solely on the limited information contained in a single input patch. As a result, their reasoning capability is limited in complex scenes characterized by ambiguity and visual uncertainty. Human experts, by contrast, when interpreting such ambiguous remote sensing images, tend to actively query auxiliary images (such as higher-resolution, larger-scale, or temporally adjacent data) to cross-verify and reason more comprehensively. Inspired by this, we propose FarmMind, a reasoning-query-driven dynamic segmentation framework for FRSIs. FarmMind overcomes the limitations of the static paradigm by introducing a reasoning-query mechanism that dynamically queries external auxiliary images on demand to compensate for the insufficient information in a single input image. Unlike direct querying, this mechanism simulates how human experts think when faced with segmentation ambiguity: it first reasons about the root causes of the ambiguity and then, based on that analysis, determines which type of auxiliary image to query. Extensive experiments demonstrate that FarmMind achieves superior segmentation performance and stronger generalization than existing methods. The source code and dataset used in this work are publicly available at https://github.com/WithoutOcean/FarmMind.
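The two-step reasoning-query idea (first diagnose why a patch is ambiguous, then choose which kind of auxiliary image to request) can be illustrated with a minimal sketch. This is not the FarmMind implementation: the cause labels, the uncertainty scores, the threshold, and the pairing of causes with auxiliary image types are all assumptions made for illustration. Only the three auxiliary image types (higher-resolution, larger-scale, temporally adjacent) come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AuxiliaryType(Enum):
    # The three auxiliary image types named in the abstract.
    HIGHER_RESOLUTION = "higher_resolution"
    LARGER_SCALE = "larger_scale"
    TEMPORALLY_ADJACENT = "temporally_adjacent"


class AmbiguityCause(Enum):
    # Hypothetical cause labels, introduced only for this sketch.
    BLURRED_BOUNDARY = "blurred_boundary"
    MISSING_CONTEXT = "missing_context"
    SEASONAL_APPEARANCE = "seasonal_appearance"


# Assumed pairing of each cause with the auxiliary image that might resolve it.
CAUSE_TO_QUERY = {
    AmbiguityCause.BLURRED_BOUNDARY: AuxiliaryType.HIGHER_RESOLUTION,
    AmbiguityCause.MISSING_CONTEXT: AuxiliaryType.LARGER_SCALE,
    AmbiguityCause.SEASONAL_APPEARANCE: AuxiliaryType.TEMPORALLY_ADJACENT,
}


@dataclass
class PatchEvidence:
    """Hypothetical per-patch uncertainty scores in [0, 1]."""
    boundary_uncertainty: float
    context_uncertainty: float
    temporal_uncertainty: float


def reason_about_ambiguity(
    evidence: PatchEvidence, threshold: float = 0.5
) -> Optional[AmbiguityCause]:
    """Step 1: pick the dominant cause of ambiguity, or None if the patch is clear."""
    scores = {
        AmbiguityCause.BLURRED_BOUNDARY: evidence.boundary_uncertainty,
        AmbiguityCause.MISSING_CONTEXT: evidence.context_uncertainty,
        AmbiguityCause.SEASONAL_APPEARANCE: evidence.temporal_uncertainty,
    }
    cause, score = max(scores.items(), key=lambda item: item[1])
    return cause if score >= threshold else None


def plan_query(
    evidence: PatchEvidence, threshold: float = 0.5
) -> Optional[AuxiliaryType]:
    """Step 2: map the diagnosed cause to an auxiliary image type (on-demand query)."""
    cause = reason_about_ambiguity(evidence, threshold)
    return None if cause is None else CAUSE_TO_QUERY[cause]
```

In this sketch, a patch whose scores all fall below the threshold triggers no query at all, which mirrors the on-demand nature of the mechanism; a direct-query baseline would instead request auxiliary images without the diagnostic step.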