Trust-SSL: Additive-Residual Selective Invariance for Robust Aerial Self-Supervised Learning

Self-supervised learning (SSL) is a standard approach to representation learning in aerial imagery. Existing methods enforce invariance between augmented views, which works well when the augmentations preserve semantic content. Aerial images, however, are frequently degraded by haze, motion blur, rain, and occlusion, which remove critical evidence, and enforcing alignment between a clean view and a severely degraded one can introduce spurious structure into the latent space. This study proposes a training strategy and architectural modification that make SSL more robust to such corruptions. It introduces a per-sample, per-factor trust weight into the alignment objective and adds the resulting trust-weighted term to the base contrastive loss as an additive residual, with a stop-gradient applied to the trust weight, rather than using the trust weight as a multiplicative gate. Although a multiplicative gate is the natural choice, experiments show that it impairs the backbone, whereas the additive-residual formulation improves it. Under a 200-epoch protocol on a 210,000-image corpus, the method achieves the highest mean linear-probe accuracy among six backbones on EuroSAT, AID, and NWPU-RESISC45 (90.20%, compared to 88.46% for SimCLR and 89.82% for VICReg). Its largest improvements occur under severe information-erasing corruptions on EuroSAT (+19.9 points over SimCLR on haze at s=5). The method also yields consistent gains of +1 to +3 points in Mahalanobis AUROC on a zero-shot cross-domain stress test using BDD100K weather splits. Two ablations (scalar uncertainty and cosine gate) indicate that the additive-residual formulation is the primary source of these improvements. An evidential variant based on Dempster-Shafer fusion additionally provides interpretable conflict and ignorance signals. These findings offer a concrete design principle for uncertainty-aware SSL. Code is publicly available at https://github.com/WadiiBoulila/trust-ssl.
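
The contrast between a multiplicative gate and an additive residual can be illustrated with a minimal, forward-only sketch. It is not taken from the paper or its repository: the function names, the inputs `base`, `align`, and `trust`, the weight `lam`, and the choice to gate the base loss by the mean trust are all assumptions made for illustration. The trust values are plain floats, which loosely mirrors a stop-gradient in that they are treated as constants rather than quantities to optimize.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def multiplicative_gate_loss(base, trust):
    """Hypothetical gated loss: each sample's base loss is scaled by its mean trust.

    base[i]     : base contrastive loss of sample i
    trust[i][k] : trust weight of sample i for factor k, in [0, 1]
    """
    return mean([mean(t_i) * b_i for b_i, t_i in zip(base, trust)])


def additive_residual_loss(base, align, trust, lam=1.0):
    """Hypothetical additive-residual loss: base loss plus a trust-weighted alignment term.

    align[i][k] : alignment error of sample i for factor k
    lam         : assumed scalar weight on the residual term
    """
    per_sample = []
    for b_i, a_i, t_i in zip(base, align, trust):
        # Trust weights act as constants here (the analogue of a stop-gradient).
        residual = sum(t * a for t, a in zip(t_i, a_i))
        per_sample.append(b_i + lam * residual)
    return mean(per_sample)


# Sample 0: clean pair, fully trusted. Sample 1: severely degraded pair, zero trust.
base = [1.0, 2.0]
align = [[0.5, 0.5], [4.0, 4.0]]
trust = [[1.0, 1.0], [0.0, 0.0]]

gated = multiplicative_gate_loss(base, trust)
additive = additive_residual_loss(base, align, trust)
print(gated, additive)
```

In this toy setting the gated loss drops the degraded sample entirely, including its base contrastive signal, while the additive form keeps the base loss for every sample and only suppresses the extra alignment term where trust is low. Whether this is the mechanism behind the reported backbone degradation is not something the sketch can establish.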
