In this paper, we propose a fully supervised pre-training scheme based on contrastive learning that is tailored to dense classification tasks. The proposed Context-Self Contrastive Loss (CSCL) learns an embedding space that makes semantic boundaries pop up, using a similarity metric between every location in a training sample and its local context. For crop type semantic segmentation from Satellite Image Time Series (SITS), we find performance at parcel boundaries to be a critical bottleneck and explain how CSCL tackles the underlying cause of that problem, improving the state-of-the-art performance in this task. Additionally, using images from the Sentinel-2 (S2) satellite missions, we compile what is, to our knowledge, the largest SITS dataset densely annotated by crop type and parcel identities, which we make publicly available together with the data generation pipeline. Using that data, we find that CSCL, even with minimal pre-training, improves all respective baselines, and we present a process for semantic segmentation at super-resolution for obtaining crop classes at a more granular level. The code and instructions to download the data can be found at https://github.com/michaeltrs/DeepSatModels.
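To make the idea of a similarity between every location and its local context concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes cosine similarity over a k x k window, a binary same-label target, a binary cross-entropy penalty on the rescaled similarity, and non-negative integer class labels. All function names are hypothetical and do not come from the DeepSatModels repository.

```python
import numpy as np


def local_context_similarity(emb, k=3):
    """Cosine similarity between each location and its k x k neighbourhood.

    emb: (H, W, D) array of per-location embeddings.
    Returns (sim, valid), both of shape (H, W, k, k); `valid` is False for
    neighbours that fall outside the image.
    """
    assert k % 2 == 1, "window size must be odd"
    H, W, _ = emb.shape
    r = k // 2
    # Normalise embeddings so that a dot product equals cosine similarity.
    unit = emb / (np.linalg.norm(emb, axis=-1, keepdims=True) + 1e-8)
    padded = np.pad(unit, ((r, r), (r, r), (0, 0)))
    inside = np.pad(np.ones((H, W), dtype=bool), r)
    sim = np.zeros((H, W, k, k))
    valid = np.zeros((H, W, k, k), dtype=bool)
    for dy in range(k):
        for dx in range(k):
            # Neighbour at offset (dy - r, dx - r) from every centre location.
            neigh = padded[dy:dy + H, dx:dx + W]
            sim[:, :, dy, dx] = np.sum(unit * neigh, axis=-1)
            valid[:, :, dy, dx] = inside[dy:dy + H, dx:dx + W]
    return sim, valid


def local_context_targets(labels, k=3):
    """Target affinity: 1 where a neighbour shares the centre label, else 0.

    labels: (H, W) array of non-negative integer class ids (assumption).
    """
    H, W = labels.shape
    r = k // 2
    padded = np.pad(labels, r, constant_values=-1)  # -1 never matches a class
    tgt = np.zeros((H, W, k, k))
    for dy in range(k):
        for dx in range(k):
            tgt[:, :, dy, dx] = padded[dy:dy + H, dx:dx + W] == labels
    return tgt


def context_contrastive_loss(emb, labels, k=3):
    """Assumed loss: binary cross-entropy between rescaled similarity and target."""
    sim, valid = local_context_similarity(emb, k)
    tgt = local_context_targets(labels, k)
    p = np.clip((sim + 1.0) / 2.0, 1e-7, 1.0 - 1e-7)  # map [-1, 1] to (0, 1)
    bce = -(tgt * np.log(p) + (1.0 - tgt) * np.log(1.0 - p))
    return bce[valid].mean()


if __name__ == "__main__":
    # Toy example: two parcels split by a vertical boundary.
    labels = np.tile(np.array([0, 0, 1, 1]), (4, 1))
    emb = np.eye(2)[labels]  # embeddings that separate the two parcels
    print(context_contrastive_loss(emb, labels))
```

In this sketch, only locations whose window straddles a label boundary receive targets of 0, so the penalty for embeddings that fail to separate neighbouring parcels is concentrated at boundaries.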