Neural field-based 3D representations have recently been adopted in many areas, including SLAM systems. Current neural SLAM and online mapping systems achieve impressive results on simple captures, but they rely on a world-centric map representation because they use only a single neural field model. Defining such a world-centric representation requires accurate and static prior information about the scene, such as its boundaries and initial camera poses. However, in real-time, on-the-fly scene capture applications, this prior knowledge cannot be assumed to be fixed or static, since it changes dynamically and is subject to significant updates based on run-time observations. In large-scale mapping in particular, significant camera pose drift is inevitable and must be corrected via loop closure. To overcome this limitation, we propose NEWTON, a view-centric mapping method that dynamically constructs neural fields based on run-time observations. In contrast to prior works, our method enables camera pose updates through loop closure and scene boundary updates by representing the scene with multiple neural fields, each defined in the local coordinate system of a selected keyframe. Experimental results demonstrate that our method outperforms existing world-centric neural field-based SLAM systems, in particular for large-scale scenes subject to camera pose updates.
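The view-centric idea can be illustrated with a minimal sketch. It is not the NEWTON implementation: the class `LocalField`, the helper `make_pose`, and the analytic sphere used in place of a learned neural field are all hypothetical, chosen only to show why anchoring a field to a keyframe lets a loop-closure pose correction move the mapped content without changing the field itself.

```python
import numpy as np


def make_pose(yaw, translation):
    """Build a 4x4 keyframe-to-world transform from a yaw angle and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T


class LocalField:
    """Hypothetical stand-in for one neural field anchored to a keyframe.

    The 'field' is an analytic signed distance to a sphere expressed in the
    keyframe's local coordinates; a real system would use a learned network.
    """

    def __init__(self, keyframe_to_world, center_local, radius):
        self.keyframe_to_world = keyframe_to_world
        self.center_local = np.asarray(center_local, dtype=float)
        self.radius = radius

    def query_world(self, points_world):
        # Map world points into the keyframe's local frame, then query there.
        points_world = np.asarray(points_world, dtype=float)
        world_to_keyframe = np.linalg.inv(self.keyframe_to_world)
        homog = np.hstack([points_world, np.ones((len(points_world), 1))])
        local = (world_to_keyframe @ homog.T).T[:, :3]
        return np.linalg.norm(local - self.center_local, axis=1) - self.radius

    def update_pose(self, corrected_keyframe_to_world):
        # A pose correction only changes the anchor; local content is untouched.
        self.keyframe_to_world = corrected_keyframe_to_world


# One field anchored to a keyframe at x = 1.0 (assumed, drifted estimate).
field = LocalField(make_pose(0.0, [1.0, 0.0, 0.0]), center_local=[0.0, 0.0, 2.0], radius=0.5)
surface_before = np.array([[1.0, 0.0, 2.5]])
sdf_before = field.query_world(surface_before)

# A loop closure corrects the keyframe position to x = 1.2.
field.update_pose(make_pose(0.0, [1.2, 0.0, 0.0]))
surface_after = np.array([[1.2, 0.0, 2.5]])
sdf_after = field.query_world(surface_after)
sdf_stale = field.query_world(surface_before)
```

In this sketch the surface point follows the corrected keyframe pose, while the old world-space location no longer lies on the surface. A single world-centric field would instead need to be re-optimized after the same correction.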