The integration of neural rendering into SLAM systems has recently shown promising results in joint localization and photorealistic view reconstruction. However, existing methods, which rely entirely on implicit representations, are so resource-hungry that they cannot run on portable devices, defeating the original purpose of SLAM. In this paper, we present Photo-SLAM, a novel SLAM framework with a hyper primitives map. Specifically, we simultaneously exploit explicit geometric features for localization and learn implicit photometric features to represent the texture of the observed environment. Beyond actively densifying hyper primitives based on geometric features, we further introduce a Gaussian-Pyramid-based training method that progressively learns multi-level features, enhancing photorealistic mapping quality. Extensive experiments on monocular, stereo, and RGB-D datasets show that our proposed system, Photo-SLAM, significantly outperforms current state-of-the-art SLAM systems for online photorealistic mapping: on the Replica dataset, for example, PSNR is 30% higher and rendering is hundreds of times faster. Moreover, Photo-SLAM runs at real-time speed on embedded platforms such as Jetson AGX Orin, demonstrating its potential for robotics applications.
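The Gaussian-Pyramid-based training mentioned above supervises learning from coarse image levels toward fine ones. As an illustration of the underlying pyramid construction (not the paper's actual implementation; all function names here are ours), a minimal NumPy sketch that blurs and 2x-downsamples an image per level:

```python
import numpy as np

def gaussian_kernel1d(sigma=1.0, radius=2):
    # 1-D Gaussian kernel, normalized to sum to 1
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur_downsample(img, kernel):
    # Separable Gaussian blur (reflect padding), then keep every other pixel
    pad = len(kernel) // 2
    padded = np.pad(img, pad, mode="reflect")
    # Horizontal pass: convolve each row; 'valid' trims the horizontal padding
    blurred = np.apply_along_axis(lambda r: np.convolve(r, kernel, "valid"), 1, padded)
    # Vertical pass: convolve each column; 'valid' trims the vertical padding
    blurred = np.apply_along_axis(lambda c: np.convolve(c, kernel, "valid"), 0, blurred)
    return blurred[::2, ::2]

def gaussian_pyramid(img, levels=3):
    # pyramid[0] is the full-resolution image; each subsequent level halves it
    pyr = [img]
    k = gaussian_kernel1d()
    for _ in range(levels - 1):
        pyr.append(blur_downsample(pyr[-1], k))
    return pyr
```

Progressive training would then fit the coarsest level first and move down the list toward full resolution, so low-frequency appearance is learned before fine texture.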