Neural Radiance Fields (NeRF) have achieved impressive results in 3D reconstruction and novel view synthesis. A significant challenge for NeRF is editing reconstructed 3D scenes, for example removing objects, which requires consistency across multiple views and the synthesis of high-quality views. Previous studies have integrated depth priors, typically sourced from LiDAR or from COLMAP's sparse depth estimates, to improve NeRF's performance in object removal. However, these methods are either expensive or time-consuming. This paper proposes a new pipeline that combines SpinNeRF with monocular depth estimation models such as ZoeDepth to improve NeRF's performance in complex object removal while also improving efficiency. We first conduct a thorough evaluation of COLMAP's dense depth reconstruction on the KITTI dataset and show that COLMAP can be viewed as a cost-effective and scalable alternative to traditional methods like LiDAR for acquiring ground-truth depth. This result serves as the basis for evaluating monocular depth estimation models and determining the best one for generating depth priors for SpinNeRF. The new pipeline is tested in various scenarios involving 3D reconstruction and object removal. The results indicate that it significantly reduces the time required to acquire depth priors for object removal and improves the fidelity of the synthesized views, suggesting substantial potential for building high-fidelity digital twin systems more efficiently in the future.