3D reconstruction and dense RGB-D SLAM systems have advanced significantly in recent years. One notable development is the extension of Neural Radiance Fields (NeRF), which encode 3D scenes with an implicit neural representation, to SLAM, where it has shown promising results. However, depth images from consumer-grade RGB-D sensors are often sparse and noisy, which makes 3D reconstruction difficult and degrades the accuracy of the scene geometry representation. Moreover, the original hierarchical feature grid with occupancy values represents scene geometry inaccurately. In addition, existing methods select random pixels for camera tracking, which leads to inaccurate localization and is not robust in real-world indoor environments. To address these issues, we present NeSLAM, a framework that achieves accurate and dense depth estimation, robust camera tracking, and realistic novel view synthesis. First, a depth completion and denoising network provides a dense geometry prior and guides the optimization of the neural implicit representation. Second, the occupancy scene representation is replaced with a hierarchical Signed Distance Field (SDF) scene representation for high-quality reconstruction and view synthesis. Third, we propose a NeRF-based self-supervised feature tracking algorithm for robust real-time tracking. Experiments on various indoor datasets demonstrate the effectiveness and accuracy of the system in reconstruction, tracking quality, and novel view synthesis.