Neural Radiance Fields (NeRF) can be optimized to obtain high-fidelity 3D reconstructions of objects and large-scale scenes. However, NeRFs require accurate camera parameters as input; inaccurate camera parameters result in blurry renderings. Extrinsic and intrinsic camera parameters are usually estimated using Structure-from-Motion (SfM) methods as a pre-processing step to NeRF, but these techniques rarely yield perfect estimates. Prior works have therefore proposed jointly optimizing camera parameters alongside a NeRF, but these methods are prone to local minima in challenging settings. In this work, we analyze how different camera parameterizations affect this joint optimization problem, and observe that, under standard parameterizations, small perturbations to different parameters produce effects of widely differing magnitude, which can lead to an ill-conditioned optimization problem. We propose using a proxy problem to compute a whitening transform that eliminates the correlation between camera parameters and normalizes their effects, and we use this transform as a preconditioner for the camera parameters during joint optimization. Our preconditioned camera optimization significantly improves reconstruction quality on scenes from the Mip-NeRF 360 dataset: we reduce error rates (RMSE) by 67% compared to state-of-the-art NeRF approaches that do not optimize camera parameters, such as Zip-NeRF, and by 29% relative to state-of-the-art joint optimization approaches using the camera parameterization of SCNeRF. Our approach is easy to implement, does not significantly increase runtime, can be applied to a wide variety of camera parameterizations, and can straightforwardly be incorporated into other NeRF-like models.
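To make the idea of a whitening preconditioner concrete, the following is a minimal sketch under stated assumptions, not the implementation behind the reported results. The abstract does not specify the proxy problem, so we assume here that it measures how the projected pixel positions of a fixed set of 3D points change as the camera parameters are perturbed. The toy camera (focal length plus translation, no rotation), the finite-difference Jacobian, the damping term, and all function names are hypothetical choices made for illustration. Given the Jacobian J of the proxy problem, the sketch forms the matrix JᵀJ and uses its inverse square root as the preconditioner P.

```python
import numpy as np


def project(params, points):
    """Toy pinhole projection (illustrative assumption, no rotation).

    params = [focal, tx, ty, tz]; points has shape (N, 3).
    Returns the flattened (2N,) vector of projected pixel coordinates.
    """
    focal, tx, ty, tz = params
    cam = points + np.array([tx, ty, tz])
    return (focal * cam[:, :2] / cam[:, 2:3]).ravel()


def numerical_jacobian(fn, params, eps=1e-6):
    """Central-difference Jacobian of fn at params, shape (outputs, params)."""
    base = fn(params)
    jac = np.zeros((base.size, params.size))
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = eps
        jac[:, i] = (fn(params + step) - fn(params - step)) / (2 * eps)
    return jac


def whitening_preconditioner(jac, damping=1e-8):
    """Inverse matrix square root of J^T J (ZCA-style whitening).

    The small damping term is an assumption added for numerical safety.
    """
    cov = jac.T @ jac + damping * np.eye(jac.shape[1])
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


# Proxy problem: random 3D points in front of the camera.
rng = np.random.default_rng(0)
points = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 6.0], size=(256, 3))
params0 = np.array([500.0, 0.0, 0.0, 0.0])  # focal, tx, ty, tz

J = numerical_jacobian(lambda p: project(p, points), params0)
P = whitening_preconditioner(J)

# Jacobian with respect to the preconditioned parameters.
JP = J @ P

print("condition number before:", np.linalg.cond(J))
print("condition number after: ", np.linalg.cond(JP))
```

In a joint optimization loop, one would then optimize a latent vector z and recover the camera parameters as params0 + P @ z, so that a unit step in any direction of z moves the projected points by a comparable amount. In this sketch the focal length and the translations act on the projections at very different scales, and the focal length is strongly correlated with the depth translation; the preconditioned Jacobian removes both effects.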