Achieving robust and precise pose estimation in dynamic scenes is a significant research challenge in Visual Simultaneous Localization and Mapping (SLAM). Recent advancements integrating Gaussian Splatting into SLAM systems have proven effective in creating high-quality renderings using explicit 3D Gaussian models, significantly improving environmental reconstruction fidelity. However, these approaches depend on a static environment assumption and face challenges in dynamic environments due to inconsistent observations of geometry and photometry. To address this problem, we propose DG-SLAM, the first robust dynamic visual SLAM system grounded in 3D Gaussians, which provides precise camera pose estimation alongside high-fidelity reconstructions. Specifically, we propose effective strategies, including motion mask generation, adaptive Gaussian point management, and a hybrid camera tracking algorithm, to improve the accuracy and robustness of pose estimation. Extensive experiments demonstrate that DG-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, and novel-view synthesis in dynamic scenes, outperforming existing methods while preserving real-time rendering capability.
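To illustrate how a motion mask counters inconsistent photometric observations, the sketch below shows the general idea of masking dynamic pixels out of a photometric residual before it drives pose optimization. This is a minimal NumPy illustration of the concept, not the paper's actual implementation; the function name, the L1 residual, and the normalization scheme are all assumptions for illustration.

```python
import numpy as np

def masked_photometric_loss(rendered, observed, motion_mask):
    """Mean L1 photometric loss computed over static pixels only.

    rendered, observed: (H, W, C) float images.
    motion_mask: (H, W) array, 1.0 for static pixels and 0.0 for
    dynamic ones, so moving objects do not corrupt the residual.
    Note: this is an illustrative sketch, not DG-SLAM's exact loss.
    """
    residual = np.abs(rendered - observed) * motion_mask[..., None]
    n_terms = motion_mask.sum() * rendered.shape[-1]
    return residual.sum() / max(n_terms, 1.0)

# A single dynamic pixel with a large photometric error:
rendered = np.ones((2, 2, 3))
observed = rendered.copy()
observed[0, 0] = 5.0                 # moving object changed appearance
mask = np.ones((2, 2))
mask[0, 0] = 0.0                     # mark that pixel as dynamic

print(masked_photometric_loss(rendered, observed, mask))        # 0.0
print(masked_photometric_loss(rendered, observed, np.ones((2, 2))))  # 1.0
```

Without the mask, the single dynamic pixel dominates the loss (here raising it from 0.0 to 1.0), which is exactly the kind of inconsistent observation that destabilizes tracking under a static-scene assumption.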