Context Matters: Leveraging Spatiotemporal Metadata for Semi-Supervised Learning on Remote Sensing Images

Remote sensing projects typically generate large amounts of imagery that can be used to train powerful deep neural networks. However, the number of labeled images is often small, as remote sensing applications generally require expert labelers. Thus, semi-supervised learning (SSL), i.e., learning with a small pool of labeled data and a larger pool of unlabeled data, is particularly useful in this domain. Current SSL approaches generate pseudo-labels from model predictions for unlabeled samples. As the quality of these pseudo-labels is crucial for performance, utilizing additional information to improve pseudo-label quality is a promising direction. For remote sensing images, geolocation and recording time are generally available and provide a valuable source of information, as semantic concepts such as land cover are highly dependent on spatiotemporal context, e.g., due to seasonal effects and vegetation zones. In this paper, we propose to exploit spatiotemporal metainformation in SSL to improve the quality of pseudo-labels and, therefore, the final model performance. We show that directly adding the available metadata to the input of the predictor at test time degrades the prediction quality for metadata outside the spatiotemporal distribution of the training set. Thus, we propose a teacher-student SSL framework in which only the teacher network uses metainformation to improve the quality of pseudo-labels on the training set. Correspondingly, our student network benefits from the improved pseudo-labels but does not receive metadata as input, making it invariant to spatiotemporal shifts at test time. Furthermore, we propose methods for encoding and injecting spatiotemporal information into the model and introduce a novel distillation mechanism to enhance the knowledge transfer between teacher and student. Our framework, dubbed Spatiotemporal SSL, can be easily combined with several stat...
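To make the division of roles between teacher and student concrete, the following is a minimal sketch and not the authors' implementation. It assumes a sin/cos encoding of latitude, longitude, and day of year, linear classifiers on precomputed image features, and a fixed confidence threshold for keeping pseudo-labels. All names (`encode_metadata`, `teacher_pseudo_labels`, `student_loss`), the encoding itself, and the threshold are hypothetical choices made for illustration; the distillation mechanism mentioned above is not modeled.

```python
import numpy as np


def encode_metadata(lat_deg, lon_deg, day_of_year):
    """Assumed encoding: sin/cos of latitude, longitude, and season phase."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    season = 2.0 * np.pi * np.asarray(day_of_year, dtype=float) / 365.25
    return np.stack(
        [np.sin(lat), np.cos(lat),
         np.sin(lon), np.cos(lon),
         np.sin(season), np.cos(season)],
        axis=-1,
    )


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def teacher_pseudo_labels(image_feats, meta_feats, w_img, w_meta, threshold=0.9):
    """Teacher sees image features AND metadata.

    Returns hard pseudo-labels and a boolean mask marking the samples whose
    top class probability reaches the (assumed) confidence threshold.
    """
    probs = softmax(image_feats @ w_img + meta_feats @ w_meta)
    return probs.argmax(axis=-1), probs.max(axis=-1) >= threshold


def student_loss(image_feats, w_student, labels, mask):
    """Student sees image features ONLY.

    Mean cross-entropy against the teacher's pseudo-labels, averaged over
    the samples the mask keeps (0.0 if the mask keeps none).
    """
    probs = softmax(image_feats @ w_student)
    nll = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    return float((nll * mask).sum() / max(mask.sum(), 1))


# Toy usage with random features and weights (illustrative values only).
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(4, 8))           # 4 unlabeled samples
meta_feats = encode_metadata(
    [48.1, -33.9, 0.0, 60.2],                   # latitude in degrees
    [11.6, 18.4, 0.0, 24.9],                    # longitude in degrees
    [15, 200, 100, 340],                        # day of year
)
w_img = rng.normal(size=(8, 3))                 # teacher image weights
w_meta = rng.normal(size=(6, 3))                # teacher metadata weights
w_student = rng.normal(size=(8, 3))             # student weights (image only)

labels, mask = teacher_pseudo_labels(image_feats, meta_feats, w_img, w_meta, 0.5)
loss = student_loss(image_feats, w_student, labels, mask)
```

Because `student_loss` never takes `meta_feats`, the student can be applied at test time to images whose location or acquisition date lies outside the training distribution, which is the property the framework is designed around.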
