Global semantic 3D understanding from single-view high-resolution remote sensing (RS) imagery is crucial for Earth Observation (EO). However, the task is hampered by the high cost of annotation and data collection and by geographically restricted data availability. Synthetic data offer a promising solution: they are easy to obtain and therefore make large, diverse datasets feasible. We develop a specialized synthetic data generation pipeline for EO and introduce SynRS3D, the largest synthetic RS 3D dataset. SynRS3D comprises 69,667 high-resolution optical images that cover six city styles worldwide and provide eight land cover types, precise height information, and building change masks. To further enhance its utility, we develop RS3DAda, a novel multi-task unsupervised domain adaptation (UDA) method that, coupled with our synthetic dataset, facilitates the RS-specific transition from synthetic to real scenarios for land cover mapping and height estimation, ultimately enabling global monocular 3D semantic understanding based on synthetic data. Extensive experiments on various real-world datasets demonstrate the adaptability and effectiveness of our synthetic dataset and the proposed RS3DAda method. SynRS3D and the related code will be made available.