Dense RGB-D SLAM based on neural implicit representations has been studied in recent years. However, existing approaches rely on a static-environment assumption and do not work robustly in dynamic environments, because the observed geometry and photometry are inconsistent across frames. To address the challenges of dynamic environments, we propose RoDyn-SLAM, a novel dynamic SLAM framework based on neural radiance fields. Specifically, we introduce a motion mask generation method that filters out invalid sampled rays. It fuses an optical flow mask with a semantic mask to improve the precision of the motion mask. To further improve the accuracy of pose estimation, we design a divide-and-conquer pose optimization algorithm that distinguishes between keyframes and non-keyframes. We also propose an edge warp loss that strengthens the geometric constraints between adjacent frames. Extensive experiments on two challenging datasets show that RoDyn-SLAM achieves state-of-the-art performance among recent neural RGB-D methods in both accuracy and robustness.
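As a rough illustration of the motion-mask idea, the sketch below fuses two binary masks and discards sampled rays that land on masked pixels. The abstract does not specify the fusion rule, so a per-pixel union (logical OR) is assumed here; the function names `fuse_motion_mask` and `filter_rays` and the toy 4x4 masks are hypothetical and are not taken from the RoDyn-SLAM implementation.

```python
from typing import List, Tuple

Mask = List[List[bool]]  # True marks a pixel assumed to be dynamic


def fuse_motion_mask(flow_mask: Mask, semantic_mask: Mask) -> Mask:
    """Fuse two masks with a per-pixel OR (an assumed rule, not the paper's)."""
    return [
        [f or s for f, s in zip(flow_row, sem_row)]
        for flow_row, sem_row in zip(flow_mask, semantic_mask)
    ]


def filter_rays(pixels: List[Tuple[int, int]], motion_mask: Mask) -> List[Tuple[int, int]]:
    """Keep only sampled (row, col) pixels that fall outside the motion mask."""
    return [(r, c) for r, c in pixels if not motion_mask[r][c]]


# Toy 4x4 example: each mask flags one pixel as dynamic.
flow_mask = [[False] * 4 for _ in range(4)]
semantic_mask = [[False] * 4 for _ in range(4)]
flow_mask[0][0] = True
semantic_mask[1][1] = True

motion_mask = fuse_motion_mask(flow_mask, semantic_mask)
sampled_pixels = [(0, 0), (1, 1), (2, 2), (3, 0)]
valid_pixels = filter_rays(sampled_pixels, motion_mask)
print(valid_pixels)
```

Under the union assumption, a pixel is rejected if either cue flags it, which trades some static pixels for fewer dynamic ones leaking into tracking and mapping. A stricter or learned fusion rule would only change `fuse_motion_mask`.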