BirdNeRF: Fast Neural Reconstruction of Large-Scale Scenes From Aerial Imagery

In this study, we introduce BirdNeRF, an adaptation of Neural Radiance Fields (NeRF) designed specifically for reconstructing large-scale scenes from aerial imagery. Unlike previous research focused on small-scale, object-centric NeRF reconstruction, our approach addresses multiple challenges, including (1) the slow training and rendering associated with large models; (2) the computational demands of modeling a substantial number of images, which require extensive resources such as high-performance GPUs; and (3) the significant artifacts and low visual fidelity commonly observed in large-scale reconstruction tasks due to limited model capacity. Specifically, we present a novel bird-view pose-based spatial decomposition algorithm that decomposes a large aerial image set into multiple small sets with appropriately sized overlaps, allowing us to train an individual NeRF for each sub-scene. This decomposition not only decouples rendering time from scene size but also enables rendering to scale seamlessly to arbitrarily large environments. Moreover, it allows per-block updates of the environment, enhancing the flexibility and adaptability of the reconstruction process. Additionally, we propose a projection-guided novel view re-rendering strategy, which helps make effective use of the independently trained sub-scenes to generate superior rendering results. We evaluate our approach on existing datasets as well as on our own drone footage, improving reconstruction speed by 10x over classical photogrammetry software and 50x over a state-of-the-art large-scale NeRF solution, on a single GPU with similar rendering quality.
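To make the idea of a pose-based decomposition with overlaps concrete, the following is a minimal sketch, not the algorithm used in BirdNeRF. It assumes that each image comes with a camera position projected onto the ground plane, that the scene is split with a uniform grid, and that each cell is expanded by a fixed fraction of its size to create the overlap. The function name `decompose_by_pose` and the parameters `grid` and `overlap` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def decompose_by_pose(
    positions: List[Tuple[float, float]],
    grid: Tuple[int, int] = (2, 2),
    overlap: float = 0.4,
) -> Dict[Tuple[int, int], List[int]]:
    """Group camera indices into overlapping bird-view grid blocks.

    positions: (x, y) camera positions on the ground plane (assumed input).
    grid:      number of blocks along x and y (assumed uniform grid).
    overlap:   margin added on each side of a block, as a fraction of its size.
    """
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    x_min, y_min = min(xs), min(ys)
    nx, ny = grid
    # Fall back to a unit cell size when all cameras share one coordinate.
    cell_w = (max(xs) - x_min) / nx or 1.0
    cell_h = (max(ys) - y_min) / ny or 1.0

    blocks: Dict[Tuple[int, int], List[int]] = {}
    for i in range(nx):
        for j in range(ny):
            # Expand the nominal cell bounds by the overlap margin.
            x_lo = x_min + (i - overlap) * cell_w
            x_hi = x_min + (i + 1 + overlap) * cell_w
            y_lo = y_min + (j - overlap) * cell_h
            y_hi = y_min + (j + 1 + overlap) * cell_h
            blocks[(i, j)] = [
                k
                for k, (x, y) in enumerate(positions)
                if x_lo <= x <= x_hi and y_lo <= y <= y_hi
            ]
    return blocks


# Example: 16 cameras on a 4 x 4 lattice, split into 2 x 2 overlapping blocks.
cameras = [(float(x), float(y)) for x in range(4) for y in range(4)]
blocks = decompose_by_pose(cameras, grid=(2, 2), overlap=0.4)
for key, indices in sorted(blocks.items()):
    print(key, indices)
```

In this sketch, each block's image subset would be used to train one sub-scene NeRF independently, and cameras near a block boundary appear in more than one subset. How the overlap size is chosen in practice is not specified here and would need to be tuned.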
