Neural Radiance Fields (NeRF) enable 3D scene reconstruction from 2D images and camera poses for Novel View Synthesis (NVS). Although NeRF can produce photorealistic results, it often overfits to the training views, which leads to poor geometry reconstruction, especially in low-texture areas. This limitation restricts many important applications that require accurate geometry, such as extrapolated NVS, HD mapping, and scene editing. To address it, we propose a new method that improves NeRF's 3D structure using only RGB images and semantic maps. Our approach introduces a novel plane regularization based on Singular Value Decomposition (SVD) that does not rely on any geometric prior. In addition, we leverage the Structural Similarity Index Measure (SSIM) in our loss design to properly initialize the volumetric representation of NeRF. Quantitative and qualitative results show that our method outperforms popular regularization approaches in accurate geometry reconstruction for large-scale outdoor scenes and achieves state-of-the-art (SoTA) rendering quality on the KITTI-360 NVS benchmark.
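The idea behind an SVD-based plane regularization can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything below is an assumption made for illustration: the function name `plane_svd_loss` is hypothetical, the input is assumed to be the 3D points where rays of one planar semantic class (for example, road pixels in a patch) are expected to terminate, and the normalization by the sum of singular values is one plausible choice, not necessarily the authors'. NumPy is used for readability; in training, the same computation would live in a differentiable framework.

```python
import numpy as np


def plane_svd_loss(points: np.ndarray) -> float:
    """Planarity penalty for an (N, 3) array of 3D points, N >= 3.

    Hypothetical sketch, not the paper's exact formulation: returns the
    smallest singular value of the centered point set divided by the sum
    of all singular values. The result lies in [0, 1/3] and is close to 0
    when the points are coplanar.
    """
    if points.ndim != 2 or points.shape[1] != 3 or points.shape[0] < 3:
        raise ValueError("expected an (N, 3) array with N >= 3")

    # Remove the centroid so the best-fit plane passes through the origin.
    centered = points - points.mean(axis=0, keepdims=True)

    # Singular values are returned in descending order; the last one
    # measures the spread along the estimated plane normal.
    singular_values = np.linalg.svd(centered, compute_uv=False)

    # Assumed normalization: makes the penalty invariant to scene scale.
    return float(singular_values[-1] / (singular_values.sum() + 1e-8))


# Usage: a tilted plane versus the same plane with noise along z.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
flat = np.column_stack([xy, 0.5 * xy[:, 0] + 0.2 * xy[:, 1]])
bumpy = flat + np.column_stack([np.zeros((200, 2)), rng.normal(0.0, 0.1, 200)])

flat_loss = plane_svd_loss(flat)    # near zero: points are coplanar
bumpy_loss = plane_svd_loss(bumpy)  # larger: points deviate from a plane
```

Because the penalty needs only the points themselves, no depth map, normal map, or other geometric prior is involved; the semantic map only decides which points are grouped together.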