Neural Radiance Fields (NeRFs) aim to synthesize novel views of objects and scenes from object-centric camera views with large overlaps. However, we argue that this paradigm does not fit street views, which self-driving cars collect from large-scale unbounded scenes and in which the onboard cameras perceive the scene with little overlap. As a result, existing NeRFs often produce blur, 'floaters', and other artifacts in street-view synthesis. In this paper, we propose a new street-view NeRF (S-NeRF) that jointly considers novel view synthesis of both the large-scale background scenes and the foreground moving vehicles. Specifically, we improve the scene parameterization function and the camera poses to learn better neural representations from street views. We also use the noisy and sparse LiDAR points to boost training, and we learn a robust geometry- and reprojection-based confidence to address depth outliers. Moreover, we extend S-NeRF to reconstruct moving vehicles, which is impracticable for conventional NeRFs. Thorough experiments on large-scale driving datasets (e.g., nuScenes and Waymo) demonstrate that our method beats state-of-the-art rivals, reducing the mean-squared error by 7% to 40% in street-view synthesis and achieving a 45% PSNR gain for moving-vehicle rendering.
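To relate the two kinds of figures quoted above, recall the standard definition PSNR = 10 log10(MAX^2 / MSE). Under this definition, a relative reduction in MSE translates into a fixed PSNR gain in decibels, independent of the baseline error. The following is a minimal sketch of that arithmetic for the 7% and 40% endpoints. The function names and the default max_val=1.0 (pixel values normalized to [0, 1]) are assumptions made for illustration and are not taken from the paper. The sketch also assumes PSNR is computed from a single aggregate MSE, whereas evaluations that average per-image PSNR may give somewhat different numbers.

```python
import math


def psnr_from_mse(mse, max_val=1.0):
    """Standard PSNR in dB for a given mean-squared error.

    max_val is the peak pixel value (assumed 1.0, i.e. normalized images).
    """
    return 10.0 * math.log10(max_val ** 2 / mse)


def psnr_gain_db(mse_reduction):
    """PSNR gain in dB when the MSE shrinks by the fraction mse_reduction.

    Follows from PSNR(new) - PSNR(old) = 10 * log10(MSE_old / MSE_new)
    with MSE_new = (1 - mse_reduction) * MSE_old.
    """
    return -10.0 * math.log10(1.0 - mse_reduction)


if __name__ == "__main__":
    # Endpoints of the MSE reduction range reported in the abstract.
    for reduction in (0.07, 0.40):
        gain = psnr_gain_db(reduction)
        print(f"{reduction:.0%} lower MSE -> +{gain:.2f} dB PSNR")
```

Under these assumptions, a 7% reduction in MSE corresponds to roughly +0.32 dB and a 40% reduction to roughly +2.22 dB.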