Recently, dense Simultaneous Localization and Mapping (SLAM) based on neural implicit representations has shown impressive progress in hole filling and high-fidelity mapping. Nevertheless, existing methods either rely heavily on known scene bounds or suffer from inconsistent reconstruction due to drift in potential loop-closure regions, or both, which can be attributed to their inflexible representations and lack of local constraints. In this paper, we present LCP-Fusion, a neural implicit SLAM system with enhanced local constraints and computable priors, which adopts a sparse voxel octree containing feature grids and SDF priors as a hybrid scene representation, enabling scalability and robustness during mapping and tracking. To enhance the local constraints, we propose a novel sliding-window selection strategy based on visual overlap to address loop closure, and a practical warping loss to constrain relative poses. Moreover, we estimate SDF priors as a coarse initialization for the implicit features, which brings additional explicit constraints and robustness, especially when a lightweight but efficient adaptive early-ending scheme is adopted. Experiments demonstrate that our method achieves better localization accuracy and reconstruction consistency than existing RGB-D implicit SLAM systems, especially in challenging real scenes (ScanNet) as well as self-captured scenes with unknown scene bounds. The code is available at https://github.com/laliwang/LCP-Fusion.