GigaSLAM: Large-Scale Monocular SLAM with Hierarchical Gaussian Splats

Tracking and mapping in large-scale, unbounded outdoor environments using only monocular RGB input presents substantial challenges for existing SLAM systems. Traditional Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) SLAM methods are typically limited to small, bounded indoor settings. To overcome these challenges, we introduce GigaSLAM, the first RGB NeRF/3DGS-based SLAM framework for kilometer-scale outdoor environments, as demonstrated on the KITTI, KITTI-360, 4Seasons, and A2D2 datasets. Our approach employs a hierarchical sparse voxel map representation, where Gaussians are decoded by neural networks at multiple levels of detail. This design enables efficient, scalable mapping and high-fidelity viewpoint rendering across expansive, unbounded scenes. For front-end tracking, GigaSLAM utilizes a metric depth model combined with epipolar geometry and PnP algorithms to accurately estimate poses, while incorporating a Bag-of-Words-based loop closure mechanism to maintain robust alignment over long trajectories. Consequently, GigaSLAM delivers high-precision tracking and visually faithful rendering on urban outdoor benchmarks, establishing a robust SLAM solution for large-scale, long-term scenarios, and significantly extending the applicability of Gaussian Splatting SLAM systems to unbounded outdoor environments. GitHub: https://github.com/DengKaiCQ/GigaSLAM.

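
The hierarchical sparse voxel map mentioned in the abstract can be pictured with a small sketch. This is not GigaSLAM's implementation: the names `BASE_VOXEL_SIZE`, `NUM_LEVELS`, `voxel_key`, `select_level`, and `SparseVoxelMap` are hypothetical, the distance-doubling rule for choosing a level is an assumption made for illustration, and the sketch stores raw points where the real system would hold features that a neural network decodes into Gaussians.

```python
import math
from collections import defaultdict

BASE_VOXEL_SIZE = 0.5   # metres at the finest level (hypothetical value)
NUM_LEVELS = 4          # number of levels of detail (hypothetical value)


def voxel_key(point, level):
    """Return the sparse-grid key (level, ix, iy, iz) of the voxel containing `point`."""
    size = BASE_VOXEL_SIZE * (2 ** level)  # voxel edge doubles at each coarser level
    ix, iy, iz = (math.floor(c / size) for c in point)
    return (level, ix, iy, iz)


def select_level(point, camera_position, near=10.0):
    """Pick a coarser level as the point gets farther from the camera.

    Level 0 covers distances below `near`; beyond that, each doubling of the
    distance moves one level coarser, capped at the coarsest level.
    """
    dist = math.dist(point, camera_position)
    if dist < near:
        return 0
    level = int(math.log2(dist / near)) + 1
    return min(level, NUM_LEVELS - 1)


class SparseVoxelMap:
    """Hash map from voxel keys to the points anchored there.

    Only occupied voxels are stored, so empty space in an unbounded scene
    costs no memory.
    """

    def __init__(self):
        self.voxels = defaultdict(list)

    def insert(self, point, camera_position):
        level = select_level(point, camera_position)
        key = voxel_key(point, level)
        self.voxels[key].append(point)
        return key
```

Calling `SparseVoxelMap().insert((30.0, 0.0, 0.0), (0.0, 0.0, 0.0))` places the point in a level-2 voxel with a 2 m edge, while a point 8 m from the camera lands in a 0.5 m voxel at level 0. Keying a hash map by level and integer grid index is one common way to keep memory proportional to observed surface area rather than to scene volume.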