Satellite imagery is increasingly used for societally critical tasks in climate, economics, and public health. Yet because landscapes are heterogeneous (e.g., a road looks different from place to place), models can perform unevenly across geographic areas. Given the potential importance of disparities in algorithmic systems used in societal contexts, we consider the risk of urban-rural disparities in the identification of land-cover features. We study this risk in semantic segmentation (a common computer vision task in which image regions are labelled according to what they show) built on pre-trained image representations generated via contrastive self-supervised learning. We propose fair dense representation with contrastive learning (FairDCL), a method for de-biasing the multi-level latent space of convolutional neural network models. The method improves feature identification by removing spurious model representations that are disparately distributed across urban and rural areas, and it does so in an unsupervised way during contrastive pre-training. The resulting image representation mitigates downstream urban-rural prediction disparities and outperforms state-of-the-art baselines on real-world satellite images. Embedding space evaluation and ablation studies further demonstrate FairDCL's robustness. As generalizability and robustness in geographic imagery are nascent topics, our work motivates researchers to consider metrics beyond average accuracy in such applications.
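To make "metrics beyond average accuracy" concrete, the sketch below reports segmentation quality per geographic group and the gap between groups, rather than a single pooled score. It is only an illustration under stated assumptions: the abstract does not say which disparity metric the paper uses, so intersection-over-union (IoU) and the max-minus-min gap are stand-ins, and the names `iou`, `group_iou_gap`, the `ROAD` label, and the toy samples are all hypothetical.

```python
from collections import defaultdict


def iou(pred, target, cls):
    """Intersection-over-union of one class over two flat label sequences."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
    # None signals that the class appears in neither prediction nor target.
    return inter / union if union else None


def group_iou_gap(samples, cls):
    """Return per-group mean IoU and the gap between best and worst group.

    samples: iterable of (group_name, predicted_labels, true_labels).
    The gap (max minus min) is an assumed disparity measure for illustration.
    """
    scores = defaultdict(list)
    for group, pred, target in samples:
        score = iou(pred, target, cls)
        if score is not None:
            scores[group].append(score)
    means = {g: sum(v) / len(v) for g, v in scores.items()}
    gap = max(means.values()) - min(means.values()) if means else 0.0
    return means, gap


ROAD = 1  # hypothetical class id for the land-cover feature of interest

# Toy data: each image is flattened to four pixels.
samples = [
    ("urban", [1, 1, 0, 0], [1, 1, 0, 0]),
    ("urban", [1, 0, 0, 0], [1, 1, 0, 0]),
    ("rural", [0, 0, 0, 1], [1, 1, 0, 0]),
    ("rural", [1, 1, 1, 0], [1, 1, 0, 0]),
]

means, gap = group_iou_gap(samples, ROAD)
print(means, gap)
```

On this toy data the pooled average hides that the rural images score well below the urban ones, which is the kind of disparity a group-wise report is meant to surface.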