Rapid post-event landslide mapping is essential for disaster response but remains difficult to automate due to extreme class imbalance. This study evaluates whether Clay v1.5, a Geo-Foundational Model (GFM), can improve pixel-level landslide segmentation on the Landslide4Sense (L4S) benchmark, which contains 3,799 training chips, each with 14 bands combining Sentinel-2 and terrain data, and approximately 2% positive (landslide) pixels. We compare three strategies: Clay as the primary encoder with multi-scale residual terrain fusion, a U-Net backbone augmented with Clay semantic context at the bottleneck, and a standard U-Net baseline. The hybrid U-Net + Clay model with two-stage Low-Rank Adaptation (LoRA) achieved the best test F1 of 64.5 ± 1.8% over three seeds, surpassing the Clay-only backbone (55.2 ± 3.6%) and the U-Net baseline (59.9%). Clay as a standalone encoder underperformed the U-Net due to the absence of multi-scale skip connections, but its pretrained representations consistently improved performance when injected as auxiliary context. These findings suggest that GFMs are most effective for landslide detection when they complement spatially detailed convolutional architectures rather than replace them.
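To illustrate why results on such an imbalanced benchmark are reported as F1 rather than pixel accuracy, the following is a minimal sketch that computes precision, recall, and F1 for the positive class of a binary mask. It is not the evaluation code used in the study: the function name `pixel_f1`, the 128 x 128 chip size, and the synthetic masks are assumptions chosen only to reproduce a positive-pixel fraction of roughly 2%.

```python
import numpy as np


def pixel_f1(pred, truth):
    """Return (precision, recall, F1) for the positive class of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)    # landslide pixels correctly detected
    fp = np.count_nonzero(pred & ~truth)   # background flagged as landslide
    fn = np.count_nonzero(~pred & truth)   # landslide pixels missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return precision, recall, f1


# Synthetic chip (assumed 128 x 128) with 320 of 16,384 pixels positive, about 2%.
truth = np.zeros((128, 128), dtype=bool)
truth[:16, :20] = True

# A model that predicts "no landslide" everywhere is highly accurate but useless.
all_background = np.zeros_like(truth)
accuracy = float(np.mean(all_background == truth))
f1_background = pixel_f1(all_background, truth)[2]

# A prediction that overlaps half of the true landslide area.
shifted = np.zeros_like(truth)
shifted[:16, 10:30] = True
precision, recall, f1_shifted = pixel_f1(shifted, truth)

print(f"all-background accuracy: {accuracy:.3f}, F1: {f1_background:.3f}")
print(f"shifted prediction: P={precision:.2f} R={recall:.2f} F1={f1_shifted:.2f}")
```

In this toy setting the all-background prediction reaches about 98% pixel accuracy while its F1 is zero, which is why positive-class F1 is the more informative measure when only about 2% of pixels are landslides.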