We present Co-SLAM, a neural RGB-D SLAM system based on a hybrid representation that performs robust camera tracking and high-fidelity surface reconstruction in real time. Co-SLAM represents the scene as a multi-resolution hash grid to exploit its fast convergence and its ability to represent high-frequency local features. In addition, Co-SLAM incorporates one-blob encoding to encourage surface coherence and completion in unobserved areas. This joint parametric-coordinate encoding enables real-time, robust performance by bringing the best of both worlds: fast convergence and surface hole filling. Moreover, our ray sampling strategy allows Co-SLAM to perform global bundle adjustment over all keyframes, instead of requiring keyframe selection to maintain a small number of active keyframes as competing neural SLAM approaches do. Experimental results show that Co-SLAM runs at 10-17 Hz and achieves state-of-the-art scene reconstruction results and competitive tracking performance on various datasets and benchmarks (ScanNet, TUM, Replica, Synthetic RGBD). Project page: https://hengyiwang.github.io/projects/CoSLAM
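To make the coordinate half of the joint encoding more concrete, the following is a minimal sketch of a one-blob encoding in NumPy. It is an illustration under stated assumptions, not the Co-SLAM implementation: the function name `one_blob_encode`, the default of 16 bins, the Gaussian kernel with width `1 / num_bins`, and the choice to evaluate the kernel at bin centers are all assumptions made here, and inputs are assumed to be normalized to [0, 1]. The multi-resolution hash grid is not shown.

```python
import numpy as np


def one_blob_encode(x, num_bins=16):
    """Encode coordinates in [0, 1] as smooth "blobs" over num_bins bins.

    Hypothetical helper for illustration: each scalar activates the bins
    near its value through a Gaussian kernel evaluated at the bin centers,
    giving a soft, continuous alternative to one-hot binning.
    """
    x = np.asarray(x, dtype=np.float64)
    # Bin centers are evenly spaced inside [0, 1].
    centers = (np.arange(num_bins) + 0.5) / num_bins
    # Assumed kernel width: one bin.
    sigma = 1.0 / num_bins
    # Broadcast every coordinate against every bin center.
    diff = x[..., None] - centers
    return np.exp(-0.5 * (diff / sigma) ** 2)


# One 3D point, already normalized to the unit cube.
points = np.array([[0.1, 0.5, 0.9]])
# Encode each axis separately, then flatten to one feature vector per point.
enc = one_blob_encode(points).reshape(points.shape[0], -1)
```

Because neighboring bins respond smoothly to nearby coordinates, a network fed with such features tends to interpolate across small gaps, which is the intuition behind the surface coherence and hole filling described above. In a joint encoding, features like `enc` would be combined with the hash-grid features before being passed to the decoder; how exactly they are combined is not specified in this abstract.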