Generating 3D city models rapidly is crucial for many applications. Monocular height estimation is one of the most efficient and timely ways to obtain large-scale geometric information. However, existing works focus primarily on training and testing models using unbiased datasets, which does not align well with real-world applications. We therefore propose a new benchmark to study the transferability of height estimation models in a cross-dataset setting. To this end, we first design and construct a large-scale benchmark dataset for cross-dataset transfer learning on the height estimation task. It comprises a newly proposed large-scale synthetic dataset, a newly collected real-world dataset, and four existing datasets from different cities. Next, we design a new experimental protocol, few-shot cross-dataset transfer. Furthermore, we propose a scale-deformable convolution module that enhances the window-based Transformer to handle the scale-variation problem in height estimation. Experimental results demonstrate the effectiveness of the proposed methods in both the traditional and the cross-dataset transfer settings. The datasets and code are publicly available at https://mediatum.ub.tum.de/1662763 and https://thebenchmarkh.github.io/.