Semantic segmentation for extracting building footprints from high-resolution remote sensing images is widely used in fields such as urban planning. Large-scale building extraction, however, demands greater diversity in training samples. In this paper, we construct a Global Building Semantic Segmentation (GBSS) dataset (the dataset will be released), which comprises 116.9k pairs of samples (about 742k buildings) from six continents. The building samples vary significantly in size and style, so the dataset can serve as a more challenging benchmark for evaluating the generalization and robustness of building semantic segmentation models. We validate the dataset through quantitative and qualitative comparisons between different datasets, and further confirm its potential for transfer learning through experiments on subsets.