Self-Supervised Masked Digital Elevation Models Encoding for Low-Resource Downstream Tasks

The lack of high-quality labeled data is one of the main bottlenecks in training deep learning models. As tasks grow more complex, the penalty for overfitting and unstable learning grows with them. The prevailing paradigm today is self-supervised learning, in which a model first learns from a large corpus of unstructured, unlabeled data and then transfers that knowledge to the target task. Notable examples of self-supervision in other modalities are BERT for language modeling, Wav2Vec for speech recognition, and the Masked Autoencoder for vision, all of which use Transformers to solve a masked prediction task. GeoAI is uniquely poised to benefit from self-supervised methods because decades of data have been collected, little of which is precisely and dependably annotated. Our goal is to extract building and road segmentations from Digital Elevation Models (DEMs), which provide a detailed topography of the Earth's surface. The proposed architecture is a Masked Autoencoder pre-trained on ImageNet (with the limitation that there is a large domain discrepancy between ImageNet and DEMs), combined with a UperNet head for decoding segmentations. We evaluated this model using only 450 and 50 training images, roughly 5% and 0.5% of the original data respectively. On the building segmentation task, the model obtains 82.1% Intersection over Union (IoU) with 450 images and 69.1% IoU with only 50 images. On the more challenging road detection task, the model obtains 82.7% IoU with 450 images and 73.2% IoU with only 50 images. Any hand-labeled dataset of the Earth's surface made today will quickly become obsolete because the landscape is constantly changing. This motivates the need for data-efficient learners that can be applied to a wide variety of downstream tasks.
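
The results above are reported as Intersection over Union (IoU). As a minimal sketch of how that metric is commonly computed for a single pair of binary masks, the following pure-Python example may help. The function name `binary_iou`, the nested-list mask representation, and the handling of two empty masks are illustrative assumptions, not details taken from the paper, whose exact evaluation protocol (for example, per-image versus dataset-level averaging) is not stated in the abstract.

```python
from typing import Sequence

# Assumed representation: a 2D binary mask, 1 = target class (e.g. building), 0 = background.
Mask = Sequence[Sequence[int]]


def binary_iou(pred: Mask, target: Mask) -> float:
    """Intersection over Union for two same-shaped binary masks."""
    if len(pred) != len(target) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, target)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels marked positive in both masks
    union = 0         # pixels marked positive in at least one mask
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1

    # Assumed convention: two entirely empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(binary_iou(pred, target))  # intersection = 2, union = 4 -> 0.5
```

An IoU of 82.1% therefore means that the overlap between predicted and reference pixels covers 82.1% of the area marked positive by either mask.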

