Rapid post-event landslide mapping is essential for disaster response but remains difficult to automate because of extreme class imbalance. This study evaluates whether Clay v1.5, a Geospatial Foundation Model (GFM), can improve pixel-level landslide segmentation on the Landslide4Sense (L4S) benchmark, which contains 3,799 training chips with 14 Sentinel-2 and terrain bands and approximately 2% positive (landslide) pixels. We compare three strategies: (i) Clay as the primary encoder with multi-scale residual terrain fusion, (ii) a U-Net backbone augmented with Clay semantic context at the bottleneck, and (iii) a standard U-Net baseline. The hybrid U-Net + Clay model with two-stage Low-Rank Adaptation (LoRA) achieved the best test F1 of 64.5 ± 1.8% over three seeds, surpassing both the Clay-only backbone (55.2 ± 3.6%) and the U-Net baseline (59.9%). As a standalone encoder, Clay underperformed the U-Net, which we attribute to the absence of multi-scale skip connections, but its pretrained representations consistently improved performance when injected as auxiliary context. These findings suggest that GFMs are most effective for landslide detection when they complement spatially detailed convolutional architectures rather than replace them.
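To illustrate why results are reported as F1 rather than pixel accuracy at roughly 2% positive pixels, the following is a minimal sketch on a toy mask. The helper names `pixel_f1` and `pixel_accuracy`, the 1,000-pixel mask, and the two hypothetical predictors are assumptions made for illustration only; they are not taken from the study's evaluation code or data.

```python
def pixel_f1(pred, truth):
    """F1 for the positive (landslide) class over flat sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference label."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)


# Toy reference mask (illustrative): 1,000 pixels, 20 of them landslide (2%).
truth = [1] * 20 + [0] * 980

# Hypothetical predictor A: labels every pixel as background.
all_background = [0] * 1000

# Hypothetical predictor B: finds 15 of the 20 landslide pixels
# and raises 10 false alarms among the background pixels.
partial = [1] * 15 + [0] * 5 + [1] * 10 + [0] * 970

for name, pred in [("all background", all_background), ("partial detector", partial)]:
    acc = pixel_accuracy(pred, truth)
    f1 = pixel_f1(pred, truth)
    print(f"{name}: accuracy={acc:.3f}, F1={f1:.3f}")
```

On this toy mask the all-background predictor reaches 98% accuracy with an F1 of zero, while the partial detector is only marginally more accurate yet has an F1 of about 0.67, which is why positive-class F1 is the more informative metric under this degree of imbalance.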