Because they can synthesize high-quality novel views, Neural Radiance Fields (NeRFs) have recently been used to improve visual localization in known environments. However, existing methods mostly use NeRFs for data augmentation when training regression models, and their performance on novel viewpoints and appearances remains limited by the lack of geometric constraints. In this paper, we propose PNeRFLoc, a novel visual localization framework based on a unified point-based representation. On the one hand, PNeRFLoc supports initial pose estimation by matching 2D and 3D feature points, as traditional structure-based methods do; on the other hand, it enables pose refinement through novel view synthesis with rendering-based optimization. Specifically, we propose a novel feature adaptation module to close the gap between the features used for visual localization and those used for neural rendering. To improve the efficacy and efficiency of neural rendering-based optimization, we also develop an efficient rendering-based framework with a warping loss function. Furthermore, we develop several robustness techniques to handle illumination changes and dynamic objects in outdoor scenarios. Experiments demonstrate that PNeRFLoc performs best on synthetic data, where the NeRF model can be learned well, and performs on par with the state-of-the-art (SOTA) method on visual localization benchmark datasets.
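The abstract does not define the warping loss, so the following is only a minimal sketch of one common form of warping-based photometric loss, not PNeRFLoc's implementation. It assumes a pinhole camera with shared intrinsics `K`, per-pixel depths taken from a rendered reference view, grayscale images, and an L1 error; the function names `warp_pixels`, `bilinear_sample`, and `warping_loss` are hypothetical. The idea is that a candidate relative pose `(R, t)` is scored by reprojecting reference pixels into the query image and comparing intensities, so only one reference rendering is needed while the pose is being optimized.

```python
import numpy as np


def warp_pixels(pixels, depth, K, R, t):
    """Reproject reference pixels with known depth into a target view.

    pixels: (N, 2) array of (u, v) coordinates in the reference view.
    depth:  (N,) depths along the reference camera's z axis.
    K:      (3, 3) pinhole intrinsics, assumed shared by both views.
    R, t:   rotation (3, 3) and translation (3,) mapping reference-camera
            coordinates to target-camera coordinates.
    """
    homog = np.hstack([pixels, np.ones((pixels.shape[0], 1))])
    rays = homog @ np.linalg.inv(K).T          # back-projected rays with z = 1
    points_ref = rays * depth[:, None]         # 3D points in the reference frame
    points_tgt = points_ref @ R.T + t          # same points in the target frame
    proj = points_tgt @ K.T                    # homogeneous image coordinates
    return proj[:, :2] / proj[:, 2:3]


def bilinear_sample(image, uv):
    """Sample a grayscale image at float (u, v) locations, clamped to the border."""
    h, w = image.shape
    u = np.clip(uv[:, 0], 0, w - 1)
    v = np.clip(uv[:, 1], 0, h - 1)
    u0 = np.floor(u).astype(int)
    v0 = np.floor(v).astype(int)
    u1 = np.minimum(u0 + 1, w - 1)
    v1 = np.minimum(v0 + 1, h - 1)
    du, dv = u - u0, v - v0
    top = image[v0, u0] * (1 - du) + image[v0, u1] * du
    bottom = image[v1, u0] * (1 - du) + image[v1, u1] * du
    return top * (1 - dv) + bottom * dv


def warping_loss(ref_image, query_image, pixels, depth, K, R, t):
    """Mean absolute intensity difference between reference pixels and
    their reprojections in the query image (an assumed L1 photometric form)."""
    warped = warp_pixels(pixels, depth, K, R, t)
    ref_vals = bilinear_sample(ref_image, pixels)
    qry_vals = bilinear_sample(query_image, warped)
    return float(np.mean(np.abs(ref_vals - qry_vals)))


# Toy usage: the query image is the reference shifted 2 pixels to the right.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
ref = np.tile(np.arange(64, dtype=float), (48, 1))   # horizontal intensity ramp
query = np.roll(ref, 2, axis=1)
pixels = np.array([[10.0, 10.0], [20.0, 15.0], [40.0, 30.0]])
depth = np.full(3, 2.0)

# A translation of 0.04 along x at depth 2 shifts pixels by 100 * 0.04 / 2 = 2,
# so this pose explains the shift and yields a lower loss than the identity pose.
loss_good = warping_loss(ref, query, pixels, depth, K, np.eye(3), np.array([0.04, 0.0, 0.0]))
loss_bad = warping_loss(ref, query, pixels, depth, K, np.eye(3), np.zeros(3))
```

In a refinement loop, a loss of this kind would be minimized over the pose parameters, typically with a differentiable framework rather than plain NumPy; that step is omitted here.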