Neural Radiance Fields (NeRFs) have achieved great success in representing complex 3D scenes with high-resolution detail and low memory cost. Nevertheless, current NeRF-based pose estimators lack an initial pose prediction and are prone to local optima during optimization. In this paper, we present LATITUDE: Global Localization with Truncated Dynamic Low-pass Filter, which introduces a two-stage localization mechanism for city-scale NeRFs. In the place recognition stage, we train a regressor on images generated from trained NeRFs; the regressor provides an initial value for global localization. In the pose optimization stage, we minimize the residual between the observed image and the rendered image by directly optimizing the pose on the tangent plane. To avoid convergence to a local optimum, we introduce a Truncated Dynamic Low-pass Filter (TDLF) for coarse-to-fine pose registration. We evaluate our method on both synthetic and real-world data and show its potential for high-precision navigation in large-scale city scenes. Code and data will be made publicly available at https://github.com/jike5/LATITUDE.
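The abstract does not define the TDLF, so the following is only a minimal sketch of the general coarse-to-fine idea it names: a low-pass filter over the frequency bands of a positional encoding, whose cutoff rises as optimization proceeds. The cosine ramp, the function names `band_weights` and `truncated_alpha`, and the parameters `start_frac` and `end_frac` (which model "truncation" as restricting the cutoff to a sub-range of the bands) are all assumptions made for illustration, not the authors' formulation.

```python
import math


def band_weights(alpha, num_bands):
    """Weights in [0, 1] for each frequency band of a positional encoding.

    Bands with index k well below `alpha` pass fully, bands above it are
    blocked, and the band straddling `alpha` is blended in with a cosine
    ramp. (Assumed windowing; not taken from the paper.)
    """
    weights = []
    for k in range(num_bands):
        t = min(max(alpha - k, 0.0), 1.0)  # progress through band k, clamped
        weights.append((1.0 - math.cos(math.pi * t)) / 2.0)
    return weights


def truncated_alpha(step, total_steps, num_bands, start_frac=0.5, end_frac=1.0):
    """Linear cutoff schedule restricted to [start_frac, end_frac] * num_bands.

    `start_frac` and `end_frac` are hypothetical parameters: starting above
    zero keeps some low-frequency bands active from the first step.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return num_bands * (start_frac + (end_frac - start_frac) * progress)
```

In a pose-optimization loop, the weights would multiply the corresponding sine and cosine features before they enter the network, so that early iterations see a smoothed scene and later iterations recover fine detail. The chosen start and end fractions here are placeholders that would need tuning.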