Accurate measurement of the roof-to-footprint offset in very-high-resolution remote sensing imagery is crucial for urban information extraction tasks. Existing deep learning methods typically rely on two-stage CNN models to extract regions of interest from building feature maps. In the first stage, a Region Proposal Network (RPN) extracts thousands of regions of interest (ROIs), which are then fed into a Region-based Convolutional Neural Network (RCNN) to extract the desired information. However, because the RPN is inflexible, these methods often lack effective user interaction, encounter difficulties in instance correspondence, and struggle to keep up with advances in general artificial intelligence. This paper introduces an interactive Transformer model combined with a prompt encoder to precisely extract building segmentation masks as well as the offset vectors from roofs to footprints. Our model includes a module, named ROAM, tailored to common problems in predicting roof-to-footprint offsets. We tested the model's feasibility on the publicly available BONAI dataset, reducing Prompt-Instance-Level offset errors by 14.6% to 16.3%. Additionally, we developed a Distance-NMS algorithm tailored for large-scale building offsets, which improves the accuracy of predicted building offset angles and lengths in a straightforward and efficient manner. To further validate the model's robustness, we created a new test set from 0.5 m resolution remote sensing imagery of Huizhou, China, for inference testing. Our code, training methods, and the updated dataset will be accessible at https://github.com/likaiucas.
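To make the quantities named in the abstract concrete, the following is a minimal sketch of how a roof-to-footprint offset vector relates to its length and angle, and how a footprint polygon could be obtained by translating a roof polygon. It assumes the footprint is a pure translation of the roof by the offset vector; the function names (`offset_length_and_angle`, `shift_polygon`, `offset_errors`) and the choice of error measures are illustrative assumptions, not the paper's implementation or its evaluation protocol.

```python
import math


def offset_length_and_angle(dx, dy):
    """Return (length, angle in radians) of an offset vector (dx, dy)."""
    return math.hypot(dx, dy), math.atan2(dy, dx)


def shift_polygon(polygon, dx, dy):
    """Translate every (x, y) vertex of a polygon by the offset (dx, dy).

    Assumption: the footprint has the same shape as the roof, so applying
    the roof-to-footprint offset to the roof polygon yields the footprint.
    """
    return [(x + dx, y + dy) for x, y in polygon]


def offset_errors(pred, truth):
    """Compare a predicted offset with a reference offset.

    Returns (endpoint_error, length_error, angle_error), where the angle
    error is wrapped into [0, pi]. These measures are hypothetical choices
    for illustration only.
    """
    (px, py), (tx, ty) = pred, truth
    p_len, p_ang = offset_length_and_angle(px, py)
    t_len, t_ang = offset_length_and_angle(tx, ty)
    # Wrap the angular difference so that, e.g., 350 degrees counts as 10.
    d_ang = math.atan2(math.sin(p_ang - t_ang), math.cos(p_ang - t_ang))
    return math.hypot(px - tx, py - ty), abs(p_len - t_len), abs(d_ang)


if __name__ == "__main__":
    roof = [(10, 10), (20, 10), (20, 18), (10, 18)]
    footprint = shift_polygon(roof, dx=3, dy=4)
    print(footprint)
    print(offset_errors(pred=(3.0, 4.5), truth=(3.0, 4.0)))
```

Separating length and angle in this way mirrors the abstract's statement that the two are evaluated as distinct properties of the predicted offset.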