In this work, we use multi-view aerial images to reconstruct the geometry, lighting, and material of facades using neural signed distance fields (SDFs). Our method requires no complex equipment: it takes only RGB images captured by a drone as input and enables physically based, photorealistic novel-view rendering, relighting, and editing. However, a real-world facade usually has a complex appearance, ranging from diffuse rocks with subtle details to large glass windows with specular reflections, which makes it hard to reconstruct every region well with a single uniform treatment. As a result, previous methods can preserve geometric details but fail to reconstruct smooth glass windows, or vice versa. To address this challenge, we introduce three spatially and semantically adaptive optimization strategies: a semantic regularization approach based on zero-shot segmentation techniques to improve material consistency, a frequency-aware geometry regularization to balance smoothness and detail across different surfaces, and a visibility probe-based scheme to enable efficient modeling of local lighting in large-scale outdoor environments. In addition, we capture a real-world set of aerial 3D-scanning images of facades, together with the corresponding point clouds, for training and benchmarking. Experiments demonstrate the superior quality of our method on holistic inverse rendering of facades, novel view synthesis, and scene editing compared to state-of-the-art baselines.
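For readers unfamiliar with the representation named above, the following is a minimal sketch of what a signed distance field provides, not the method described in this work. It swaps the neural SDF for an analytic one: a thin axis-aligned slab standing in for a flat facade. The names `sdf_box`, `sphere_trace`, and `gradient_norm`, and all numeric values, are assumptions chosen for illustration. The sketch shows two properties that SDF-based reconstruction relies on: a ray-surface intersection can be found by stepping along the ray by the returned distance, and the gradient of a valid SDF has unit norm away from edges and the medial axis.

```python
import math

def sdf_box(p, half_extents):
    """Signed distance from point p to an axis-aligned box centred at the origin.

    Negative inside the box, positive outside, zero on the surface.
    """
    q = [abs(c) - h for c, h in zip(p, half_extents)]
    outside = math.sqrt(sum(max(c, 0.0) ** 2 for c in q))
    inside = min(max(q), 0.0)
    return outside + inside

def sphere_trace(origin, direction, sdf, max_steps=128, eps=1e-5, t_max=100.0):
    """March along a unit-length ray, stepping by the SDF value each time.

    Returns the ray parameter t at the first hit, or None if nothing is hit.
    """
    t = 0.0
    for _ in range(max_steps):
        p = [o + t * d for o, d in zip(origin, direction)]
        dist = sdf(p)
        if dist < eps:
            return t
        t += dist
        if t > t_max:
            break
    return None

def gradient_norm(sdf, p, h=1e-4):
    """Norm of the SDF gradient at p, estimated by central differences."""
    grad = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * h))
    return math.sqrt(sum(g * g for g in grad))

# Hypothetical "facade": a 4 x 6 slab that is 0.2 thick, facing the +z axis.
HALF_EXTENTS = (2.0, 3.0, 0.1)

def facade(p):
    return sdf_box(p, HALF_EXTENTS)

# A camera 5 units in front of the slab, looking straight at it.
hit_t = sphere_trace((0.0, 0.0, 5.0), (0.0, 0.0, -1.0), facade)
# The unit-gradient (eikonal) property, checked at a point in front of the slab.
eikonal = gradient_norm(facade, (0.5, 0.2, 1.0))
```

In a learned setting, the analytic `sdf_box` would be replaced by a network, and the unit-gradient property is typically encouraged through a regularization term rather than holding by construction.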