Neural Radiance Fields (NeRF) enable 3D scene reconstruction from 2D images and camera poses for Novel View Synthesis (NVS). Although NeRF can produce photorealistic results, it often overfits to the training views, which leads to poor geometry reconstruction, especially in low-texture areas. This limitation restricts many important applications that require accurate geometry, such as extrapolated NVS, HD mapping, and scene editing. To address it, we propose a new method that improves NeRF's 3D structure using only RGB images and semantic maps. Our approach introduces a novel plane regularization based on Singular Value Decomposition (SVD) that does not rely on any geometric prior. In addition, we leverage the Structural Similarity Index Measure (SSIM) in our loss design to properly initialize the volumetric representation of NeRF. Quantitative and qualitative results show that our method outperforms popular regularization approaches in accurate geometry reconstruction for large-scale outdoor scenes and achieves state-of-the-art (SoTA) rendering quality on the KITTI-360 NVS benchmark.
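The abstract does not give the exact form of the SVD-based plane regularization, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a set of 3D points that belong to one planar semantic class (for example, points sampled on a road) and measures how far they are from a common plane through the smallest singular value of the centered point matrix. The function name `plane_residual` and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def plane_residual(points: np.ndarray) -> float:
    """RMS distance of an (N, 3) point set to its least-squares plane.

    Assumption for illustration: the points are 3D samples that a semantic
    map labels as one planar region. This is not the paper's loss.
    """
    if points.ndim != 2 or points.shape[1] != 3 or points.shape[0] < 3:
        raise ValueError("expected an (N, 3) array with N >= 3")
    # Center the points so the best-fit plane passes through the origin.
    centered = points - points.mean(axis=0, keepdims=True)
    # Singular values come back in descending order; the last one is the
    # spread along the direction of least variance (the plane normal).
    singular_values = np.linalg.svd(centered, compute_uv=False)
    # sigma_min**2 equals the sum of squared point-to-plane distances.
    return float(singular_values[-1] / np.sqrt(points.shape[0]))


# Synthetic check: points on the plane z = 0.1 x + 0.2 y, then a noisy copy.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
flat = np.column_stack([xy, 0.1 * xy[:, 0] + 0.2 * xy[:, 1]])
bumpy = flat + rng.normal(scale=0.05, size=flat.shape)

print("flat residual: ", plane_residual(flat))
print("bumpy residual:", plane_residual(bumpy))
```

A residual of this kind is close to zero for perfectly planar points and grows with out-of-plane deviation, and it needs no external depth or normal prior. How such a term would be weighted, differentiated, or combined with the SSIM-based loss is left open here, since the abstract does not specify it.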