We introduce MUTE-SLAM, a real-time neural RGB-D SLAM system that employs multiple tri-plane hash-encodings for efficient scene representation. MUTE-SLAM tracks camera positions and incrementally builds a scalable multi-map representation for both small and large indoor environments. It dynamically allocates sub-maps for newly observed local regions, enabling constraint-free mapping without prior scene information. Unlike traditional grid-based methods, MUTE-SLAM hash-encodes scene properties on three orthogonal axis-aligned planes, significantly reducing hash collisions and the number of trainable parameters. This hybrid approach both speeds up convergence and improves the fidelity of surface reconstruction. Furthermore, our optimization strategy jointly optimizes all sub-maps that intersect the current camera frustum, ensuring global consistency. Extensive experiments on both real-world and synthetic datasets show that MUTE-SLAM delivers state-of-the-art surface reconstruction quality and competitive tracking performance across diverse indoor settings. The code will be made public upon acceptance of the paper.
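To make the tri-plane hash-encoding idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single resolution level, a two-prime XOR spatial hash, bilinear interpolation on each plane, and concatenation of the three per-plane features; the constants `TABLE_SIZE`, `FEATURE_DIM`, `RESOLUTION` and all function names are hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical settings for illustration only; none are taken from the paper.
TABLE_SIZE = 2 ** 14    # entries per plane hash table
FEATURE_DIM = 2         # feature channels stored per entry
RESOLUTION = 64         # grid vertices per plane axis (single level)
PRIME_X, PRIME_Y = np.uint64(1), np.uint64(2654435761)


def hash_2d(ix, iy, table_size):
    """Map integer 2D grid coordinates to table indices in [0, table_size)."""
    ix = ix.astype(np.uint64)
    iy = iy.astype(np.uint64)
    hashed = ((ix * PRIME_X) ^ (iy * PRIME_Y)) % np.uint64(table_size)
    return hashed.astype(np.int64)


def encode_plane(table, uv, resolution=RESOLUTION):
    """Bilinearly interpolate hashed features at 2D coordinates in [0, 1]^2."""
    xy = np.clip(uv, 0.0, 1.0) * (resolution - 1)
    # Lower-left vertex of the enclosing cell, kept inside the grid.
    lo = np.minimum(np.floor(xy).astype(np.int64), resolution - 2)
    frac = xy - lo
    out = np.zeros((uv.shape[0], table.shape[1]))
    for dx in (0, 1):
        for dy in (0, 1):
            idx = hash_2d(lo[:, 0] + dx, lo[:, 1] + dy, table.shape[0])
            wx = frac[:, 0] if dx else 1.0 - frac[:, 0]
            wy = frac[:, 1] if dy else 1.0 - frac[:, 1]
            out += (wx * wy)[:, None] * table[idx]
    return out


def encode_triplane(tables, points):
    """Concatenate features from the xy, xz and yz planes for points in [0, 1]^3."""
    axes = ((0, 1), (0, 2), (1, 2))
    feats = [encode_plane(t, points[:, list(a)]) for t, a in zip(tables, axes)]
    return np.concatenate(feats, axis=1)


rng = np.random.default_rng(0)
tables = [rng.normal(scale=1e-2, size=(TABLE_SIZE, FEATURE_DIM)) for _ in range(3)]
points = rng.random((5, 3))          # query points, normalized to the sub-map bounds
features = encode_triplane(tables, points)
print(features.shape)  # (5, 6)
```

In this sketch the three planes together address 3 × 64² = 12,288 vertices, whereas a dense 3D grid at the same resolution would address 64³ = 262,144, which illustrates why fewer entries compete for each hash slot. In a full system the tables would be trainable parameters, one set per sub-map, and the concatenated features would feed a small decoder.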