Rapid post-event landslide mapping is essential for disaster response but remains difficult to automate due to extreme class imbalance. This study evaluates whether Clay v1.5, a Geospatial Foundation Model (GFM), can improve pixel-level landslide segmentation on the Landslide4Sense (L4S) benchmark, which contains 3,799 training chips with 14 Sentinel-2 and terrain bands and approximately 2% positive pixels. We compare three strategies: Clay as the primary encoder with multi-scale residual terrain fusion, a U-Net backbone augmented with Clay semantic context at the bottleneck, and a standard U-Net baseline. The hybrid U-Net + Clay model with two-stage Low-Rank Adaptation (LoRA) achieved the best test F1 of 64.5 ± 1.8% over three seeds, surpassing the Clay-only backbone (55.2 ± 3.6%) and the U-Net baseline (59.9%). Clay as a standalone encoder underperformed the U-Net due to the absence of multi-scale skip connections, but its pretrained representations consistently improved performance when injected as auxiliary context. These findings suggest that GFMs are most effective for landslide detection when they complement spatially detailed convolutional architectures rather than replace them.
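To illustrate why results are reported as F1 rather than accuracy, the following minimal sketch scores a trivial predictor on a synthetic mask with about 2% positive pixels. It is not the study's evaluation code: the 128 x 128 chip size, the helper names `pixel_f1` and `pixel_accuracy`, and the randomly placed positives are all assumptions made for illustration.

```python
import random


def pixel_f1(pred, truth):
    """F1 of the positive (landslide) class over flat lists of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # With no true positives, false positives, or false negatives, F1 is taken as 0.
    return 2 * tp / denom if denom else 0.0


def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the reference."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)


# Assumed chip size (illustrative only), flattened to one list of pixels.
n_pixels = 128 * 128
n_pos = round(0.02 * n_pixels)  # about 2% positive pixels, as in the benchmark

# Synthetic reference mask with randomly placed positives (not real data).
truth = [1] * n_pos + [0] * (n_pixels - n_pos)
random.Random(0).shuffle(truth)

# A degenerate predictor that labels every pixel as background.
all_background = [0] * n_pixels

acc = pixel_accuracy(all_background, truth)
f1 = pixel_f1(all_background, truth)
print(f"accuracy = {acc:.3f}, F1 = {f1:.3f}")
```

Under these assumptions the all-background predictor reaches roughly 98% accuracy while its F1 is zero, which is why positive-class F1 is the more informative metric at this level of imbalance.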