Neural implicit representations have recently demonstrated compelling results on dense Simultaneous Localization And Mapping (SLAM), but they suffer from accumulated errors in camera tracking and from distortion in the reconstruction. To address this, we present GO-SLAM, a deep-learning-based dense visual SLAM framework that globally optimizes poses and 3D reconstruction in real time. Robust pose estimation is at its core, supported by efficient loop closing and online full bundle adjustment, which optimize on a per-frame basis by utilizing the learned global geometry of the complete history of input frames. Simultaneously, we update the implicit and continuous surface representation on the fly to ensure global consistency of the 3D reconstruction. Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches in tracking robustness and reconstruction accuracy. Furthermore, GO-SLAM is versatile and can run with monocular, stereo, and RGB-D input.