Rethinking the U-Net, ResUnet, and U-Net3+ architectures with dual skip connections for building footprint extraction

From arXiv. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Building footprints and their inventories are recognised as foundational spatial information for addressing multiple societal problems. Extracting complex urban buildings requires segmenting very high-resolution (VHR) earth observation (EO) images. U-Net is a common deep learning network for such segmentation and the foundation for later variants such as ResUnet, U-Net++, and U-Net3+. These variants seek efficiency gains by redesigning the skip connection component and exploiting the multi-scale features in U-Net. However, skip connections do not always improve these networks, and context information is lost in the multi-scale features. In this paper, we propose three novel dual skip connection mechanisms for U-Net, ResUnet, and U-Net3+. These mechanisms deepen the feature maps forwarded by the skip connections to find a more accurate trade-off between context and localisation within these networks. The mechanisms are evaluated on feature maps of different scales in the three networks, producing nine new network configurations. The networks are evaluated against their original vanilla versions on four building footprint datasets (three existing and one new) of different spatial resolutions: VHR (0.3 m), high-resolution (1 m and 1.2 m), and multi-resolution (0.3 + 0.6 + 1.2 m). The proposed mechanisms report efficiency gains on five evaluation measures for U-Net and ResUnet, and gains of up to 17.7% in F1 score and 18.4% in Intersection over Union (IoU) for U-Net3+. The code will be made available on GitHub after peer review.
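For readers unfamiliar with the two headline measures, the following is a minimal sketch of how F1 score and IoU are commonly computed for binary segmentation masks. It is not the authors' evaluation code, which is not yet released. The function name `f1_and_iou`, the flattened-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def f1_and_iou(pred: Iterable[int], truth: Iterable[int]) -> Tuple[float, float]:
    """Return (F1, IoU) for two flattened binary masks (1 = building pixel).

    Hypothetical helper for illustration; masks are assumed to have equal length.
    """
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1  # building predicted and present
        elif p:
            fp += 1  # building predicted but absent
        elif t:
            fn += 1  # building present but missed
    union = tp + fp + fn
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / union
    return f1, iou


# Toy example: 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 1, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0]
f1, iou = f1_and_iou(pred, truth)
```

For a single pair of masks the two measures are tied by F1 = 2·IoU / (1 + IoU), so they always rank predictions in the same order, although IoU penalises errors more heavily. In practice the masks would usually be image arrays rather than Python lists.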
