Remote sensing images are useful for a wide variety of Earth monitoring applications, from tracking deforestation to tackling illegal fishing. The Earth is extremely diverse: the number of potential tasks in remote sensing images is massive, and the sizes of features range from several kilometers to just tens of centimeters. However, creating generalizable computer vision methods is a challenge, in part due to the lack of a large-scale dataset that captures these diverse features for many tasks. In this paper, we present Satlas, a remote sensing dataset and benchmark that is large in both breadth and scale, comprising 302M labels under 137 categories and seven label types. We evaluate eight baselines and a proposed method on Satlas, and find that there is substantial room for improvement in addressing research challenges specific to remote sensing, including processing image time series that consist of images from very different types of sensors, and taking advantage of long-range spatial context. Moreover, we find that pre-training on Satlas substantially improves performance on downstream tasks, increasing average accuracy by 18% over ImageNet and 6% over the next best baseline.