We present a parallelized optimization method, based on fast Neural Radiance Fields (NeRF), for estimating the 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we predict the camera's translation and rotation by minimizing the residual between pixels rendered from a fast NeRF model and the corresponding pixels in the observed image. We integrate a momentum-based camera extrinsic optimization procedure into Instant Neural Graphics Primitives, a recent, exceptionally fast NeRF implementation. By introducing parallel Monte Carlo sampling into the pose estimation task, our method overcomes local minima and improves efficiency over a larger search space. We also show that adopting a more robust pixel-based loss function is important for reducing error. Experiments demonstrate that our method achieves improved generalization and robustness on both synthetic and real-world benchmarks.
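The optimization loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several loud assumptions: the NeRF renderer is replaced by a toy function that rigidly transforms 2D points, the pose has three parameters (`tx`, `ty`, `theta`) instead of six, the Huber loss stands in for the robust loss (the abstract does not say which loss is used), and the resampling schedule and all names (`render`, `huber_loss_and_grad`, `estimate_pose`) are hypothetical. The sketch only shows how many pose hypotheses can be optimized in parallel with momentum and periodically resampled around the best candidates.

```python
import numpy as np


def render(poses, points):
    """Toy stand-in for NeRF rendering: rigidly transform 2D points.

    poses: (N, 3) array of [tx, ty, theta]; points: (M, 2).
    Returns predictions (N, M, 2) and their derivative w.r.t. theta.
    """
    tx, ty, th = poses[:, 0:1], poses[:, 1:2], poses[:, 2:3]
    c, s = np.cos(th), np.sin(th)
    px, py = points[None, :, 0], points[None, :, 1]
    pred = np.stack([c * px - s * py + tx, s * px + c * py + ty], axis=-1)
    dpred_dth = np.stack([-s * px - c * py, c * px - s * py], axis=-1)
    return pred, dpred_dth


def huber_loss_and_grad(poses, points, observed, delta=0.5):
    """Huber loss per hypothesis (N,) and its analytic pose gradient (N, 3)."""
    pred, dpred_dth = render(poses, points)
    r = pred - observed[None]                       # residuals, (N, M, 2)
    a = np.abs(r)
    per_elem = np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))
    loss = per_elem.mean(axis=(1, 2))
    # Derivative of the mean Huber loss w.r.t. each predicted coordinate.
    g = np.clip(r, -delta, delta) / (r.shape[1] * r.shape[2])
    grad_t = g.sum(axis=1)                          # (N, 2)
    grad_th = (g * dpred_dth).sum(axis=(1, 2))      # (N,)
    return loss, np.concatenate([grad_t, grad_th[:, None]], axis=1)


def estimate_pose(points, observed, n_hyp=64, steps=300, resample_every=25,
                  lr=0.5, beta=0.9, seed=0):
    """Optimize many pose hypotheses in parallel; return the best one."""
    rng = np.random.default_rng(seed)
    # Monte Carlo initialization: sample hypotheses over a wide pose range.
    poses = np.column_stack([
        rng.uniform(-2.0, 2.0, n_hyp),
        rng.uniform(-2.0, 2.0, n_hyp),
        rng.uniform(-np.pi, np.pi, n_hyp),
    ])
    vel = np.zeros_like(poses)
    noise = 0.2
    for step in range(1, steps + 1):
        _, grad = huber_loss_and_grad(poses, points, observed)
        vel = beta * vel - lr * grad                # heavy-ball momentum
        poses = poses + vel
        if step % resample_every == 0 and step < steps:
            loss, _ = huber_loss_and_grad(poses, points, observed)
            order = np.argsort(loss)
            keep, drop = order[: n_hyp // 2], order[n_hyp // 2:]
            # Replace the worst half with perturbed copies of the best half.
            parents = rng.choice(keep, size=len(drop))
            poses[drop] = poses[parents] + rng.normal(0.0, noise, (len(drop), 3))
            vel[drop] = 0.0
            noise *= 0.5                            # shrink the search radius
    loss, _ = huber_loss_and_grad(poses, points, observed)
    best = int(np.argmin(loss))
    return poses[best], float(loss[best])


# Usage: recover a known toy pose from synthetic observations.
_rng = np.random.default_rng(1)
points = _rng.uniform(-1.0, 1.0, (64, 2))
true_pose = np.array([0.4, -0.3, 0.9])
observed = render(true_pose[None], points)[0][0]
pose, final_loss = estimate_pose(points, observed)
```

This toy loss is much simpler than a photometric loss over rendered pixels, so it does not reproduce the local minima the method is designed to handle. In a real setting, `render` would be replaced by a differentiable NeRF renderer and the gradient would come from automatic differentiation.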