Real-time Simultaneous Localization and Mapping (SLAM) based on 3D Gaussian splatting (3DGS) remains challenging in large-scale real-world environments, as existing methods often struggle to jointly achieve low-latency pose estimation, 3D Gaussian reconstruction that keeps pace with incoming sensor streams, and long-term global consistency. In this paper, we present a tightly coupled LiDAR-Inertial-Visual (LIV) 3DGS-based SLAM framework for real-time pose estimation and photorealistic mapping in large-scale real-world scenes. The system executes state estimation and 3D Gaussian primitive initialization in parallel with global Gaussian optimization, thereby enabling continuous dense mapping. To improve Gaussian initialization quality and accelerate optimization convergence, we introduce a cascaded strategy that combines feed-forward predictions with voxel-based principal component analysis (voxel-PCA) geometric priors. To enhance global consistency in large scenes, we further perform loop closure directly on the optimized global Gaussian map: loop constraints are estimated through Gaussian-based Generalized Iterative Closest Point (GICP) registration and then enforced by pose-graph optimization. In addition, we collected challenging large-scale looped outdoor SLAM sequences with hardware-synchronized LiDAR, camera, and IMU data and ground-truth trajectories to support realistic and comprehensive evaluation. Extensive experiments on both public datasets and our dataset demonstrate that the proposed method achieves a strong balance among real-time efficiency, localization accuracy, and rendering quality across diverse and challenging real-world scenes.
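To make the voxel-PCA geometric prior more concrete, the following is a minimal sketch of how a LiDAR point cloud could be turned into initial anisotropic Gaussians, one per voxel. It is an illustration under stated assumptions, not the paper's implementation: the function name `voxel_pca_gaussians`, the parameters `voxel_size`, `min_points`, and `eps`, and the returned dictionary layout are all hypothetical, and the feed-forward prediction stage of the cascaded strategy is not reproduced here.

```python
import numpy as np


def voxel_pca_gaussians(points, voxel_size=0.5, min_points=5, eps=1e-6):
    """Fit one anisotropic Gaussian per occupied voxel via PCA.

    All names and default values are illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    # Integer voxel coordinates for every point.
    keys = np.floor(points / voxel_size).astype(np.int64)
    # Map each point to the index of its (unique) voxel.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)

    gaussians = []
    for v in range(inverse.max() + 1):
        pts = points[inverse == v]
        if len(pts) < min_points:
            continue  # too few samples for a stable covariance
        mean = pts.mean(axis=0)
        cov = np.cov(pts, rowvar=False)
        # eigh returns eigenvalues in ascending order for symmetric matrices.
        eigvals, eigvecs = np.linalg.eigh(cov)
        # Flip one axis if needed so the eigenvectors form a proper rotation.
        if np.linalg.det(eigvecs) < 0:
            eigvecs[:, 2] *= -1.0
        # Per-axis standard deviations, floored to avoid degenerate Gaussians.
        scales = np.sqrt(np.maximum(eigvals, eps))
        gaussians.append({
            "mean": mean,
            "scales": scales,         # ascending, matches eigvecs columns
            "rotation": eigvecs,      # columns are principal axes
            "normal": eigvecs[:, 0],  # direction of least variance
        })
    return gaussians
```

For points sampled on a locally planar surface, the smallest-variance axis approximates the surface normal, so the resulting Gaussian starts out flattened along that direction. Such an initialization would then presumably be refined by the global Gaussian optimization described in the abstract.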