Achieving accurate, efficient, and consistent localization within an a priori environment map remains a fundamental challenge in robotics and computer vision. Conventional map-based keyframe localization often suffers from sub-optimal viewpoints caused by a limited field of view (FOV), which degrades its performance. To address this issue, we design a real-time, tightly coupled Neural Radiance Fields (NeRF)-aided visual-inertial navigation system (VINS), termed NeRF-VINS. By leveraging NeRF's ability to synthesize novel views, which is essential for overcoming limited viewpoints, the proposed NeRF-VINS optimally fuses IMU and monocular image measurements with synthetically rendered images within an efficient filter-based framework. This tightly coupled integration enables 3D motion tracking with bounded error. We extensively compare the proposed NeRF-VINS against state-of-the-art methods that use prior map information and show that it achieves superior performance. We also demonstrate that the proposed method performs real-time estimation at 15 Hz on a resource-constrained Jetson AGX Orin embedded platform with impressive accuracy.