How Much of a Model Do We Need? Redundancy and Slimmability in Remote Sensing Foundation Models

Large-scale foundation models (FMs) in remote sensing (RS) build on paradigms established in computer vision (CV) and have shown promise for various Earth observation applications. However, the direct transfer of scaling assumptions from CV to RS has not been adequately examined. We hypothesize that RS FMs enter an overparameterized regime at substantially smaller scales than their CV counterparts, where increasing the parameter count primarily induces redundant representations rather than qualitatively new abstractions. To test this hypothesis, we use post-hoc slimming, in which we uniformly reduce the width of a pretrained encoder, as a tool to measure representational redundancy across six state-of-the-art RS FMs on four downstream classification tasks. Our findings contrast sharply with those in the CV domain: while a post-hoc slimmed masked autoencoder (MAE) trained on ImageNet retains less than 10% accuracy at 1% FLOPs, RS FMs maintain over 71% relative accuracy at the same budget. This sevenfold difference provides strong empirical support for our hypothesis. We further demonstrate that learned slimmable training can improve both Momentum Contrast (MoCo)-based and MAE-based models. In addition, through explained variance ratio and feature correlation analyses, we provide mechanistic explanations showing that RS FMs distribute task-relevant information with high redundancy. Our findings establish post-hoc slimmability as both a practical deployment strategy for resource-constrained environments and a diagnostic tool that challenges the prevailing scaling paradigm in RS. Upon acceptance, we will publish all code.
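
As an illustration of the two ingredients named above, width slimming and redundancy diagnostics, the following is a minimal NumPy sketch and not the authors' implementation. It makes several assumptions: a toy two-layer MLP stands in for a pretrained encoder, slimming keeps the first fraction of hidden units, and redundancy is summarized by the number of principal components needed to reach 95% explained variance and by the mean absolute off-diagonal feature correlation. All function names, the 95% threshold, and the synthetic data are hypothetical.

```python
import numpy as np


def slim_mlp(w1, b1, w2, keep_ratio):
    """Keep the first `keep_ratio` fraction of hidden units (assumed slimming rule)."""
    hidden = w1.shape[1]
    k = max(1, int(round(hidden * keep_ratio)))
    return w1[:, :k], b1[:k], w2[:k, :]


def forward(x, w1, b1, w2):
    """Two-layer MLP with ReLU; works for full or slimmed weights."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2


def explained_variance_ratio(features):
    """Fraction of total variance captured by each principal component."""
    centered = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def components_for_variance(evr, threshold=0.95):
    """Smallest number of components whose cumulative ratio reaches `threshold`."""
    return int(np.searchsorted(np.cumsum(evr), threshold) + 1)


def mean_abs_correlation(features):
    """Mean absolute off-diagonal Pearson correlation between feature dimensions."""
    corr = np.corrcoef(features, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.abs(off_diagonal).mean())


rng = np.random.default_rng(0)
n, d = 2000, 64

# Synthetic stand-ins for encoder features (assumption, not real RS data):
# "redundant" spreads 4 latent factors over 64 dimensions,
# "independent" uses 64 uncorrelated dimensions.
latent = rng.normal(size=(n, 4))
mixing = rng.normal(size=(4, d))
redundant = latent @ mixing + 0.01 * rng.normal(size=(n, d))
independent = rng.normal(size=(n, d))

for name, feats in [("redundant", redundant), ("independent", independent)]:
    k95 = components_for_variance(explained_variance_ratio(feats))
    print(name, "components for 95% variance:", k95,
          "| mean |corr|:", round(mean_abs_correlation(feats), 3))

# Post-hoc slimming of a toy MLP to a quarter of its hidden width.
w1 = rng.normal(size=(d, 64))
b1 = np.zeros(64)
w2 = rng.normal(size=(64, 10))
w1_s, b1_s, w2_s = slim_mlp(w1, b1, w2, keep_ratio=0.25)
print("slimmed hidden width:", w1_s.shape[1])
print("slimmed output shape:", forward(redundant[:8], w1_s, b1_s, w2_s).shape)
```

In this sketch, features that need few components to reach the variance threshold and show high mean correlation would be read as highly redundant, which is the regime in which discarding most channels is expected to cost little downstream accuracy.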
