Geo-Foundation Models (GFMs) have proven effective in diverse downstream applications, including semantic segmentation, classification, and regression tasks. However, when flood mapping on the Sen1Flood11 dataset is the downstream task, GFMs struggle to outperform the baseline U-Net, highlighting the models' limitation in capturing critical local nuances. To address this, we present the Prithvi-Complementary Adaptive Fusion Encoder (CAFE), which integrates the pretrained Prithvi GFM encoder with a parallel CNN residual branch enhanced by Convolutional Attention Modules (CAM). Prithvi-CAFE enables fast and efficient fine-tuning through adapters in Prithvi and performs multi-scale, multi-level fusion with CNN features, capturing critical local details while preserving long-range dependencies. We achieve state-of-the-art results on two comprehensive flood mapping datasets: Sen1Flood11 and FloodPlanet. On Sen1Flood11 test data, Prithvi-CAFE (IoU 83.41) outperforms the original Prithvi (IoU 82.50) and other major GFMs (TerraMind 82.90, DOFA 81.54, SpectralGPT 81.02). The improvement is even more pronounced on the hold-out test site, where Prithvi-CAFE achieves an IoU of 81.37 compared to the baseline U-Net (70.57) and the original Prithvi (72.42). On FloodPlanet, Prithvi-CAFE also surpasses the baseline U-Net and other GFMs, achieving an IoU of 64.70 compared to U-Net (60.14), TerraMind (62.33), DOFA (59.15), and Prithvi 2.0 (61.91). Our simple yet effective Prithvi-CAFE demonstrates strong potential for improving segmentation tasks where multi-channel and multi-modal data provide complementary information and local details are critical. The code is released at \href{https://github.com/Sk-2103/Prithvi-CAFE}{Prithvi-CAFE GitHub}.
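For readers unfamiliar with the metric, the IoU scores quoted above can be illustrated with a minimal sketch. This is not the evaluation code from the released repository. The function name `binary_iou`, the convention that label 1 marks water, the `ignore_value` of -1 for unlabeled pixels, and the scaling to percent are all assumptions made for illustration.

```python
import math
from typing import Sequence


def binary_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    ignore_value: int = -1,  # assumed marker for unlabeled pixels
) -> float:
    """Intersection over union of the water class (label 1), in percent.

    Hypothetical helper: `pred` and `target` are equally sized 2D masks.
    Pixels whose target equals `ignore_value` are skipped.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore_value:
                continue  # unlabeled pixel, excluded from the score
            p_water = p == 1
            t_water = t == 1
            if p_water and t_water:
                intersection += 1
            if p_water or t_water:
                union += 1
    if union == 0:
        return math.nan  # no water in prediction or target: IoU undefined
    return 100.0 * intersection / union


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [-1, 1, 1]]
    print(binary_iou(pred, target))
```

Under these assumptions, a score such as 83.41 would mean that about 83% of the pixels marked as water in either the prediction or the reference are marked as water in both.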