We introduce an integrated, precise LiDAR, Inertial, and Visual (LIV) multimodal sensor-fused mapping system that builds on differentiable Gaussians to improve mapping fidelity, quality, and structural accuracy. Notably, it also constitutes a novel form of tightly coupled map for LiDAR-visual-inertial sensor fusion. The system leverages the complementary characteristics of LiDAR and visual data to capture the geometric structure of large-scale 3D scenes and restore their visual surface information with high fidelity. The scene's surface Gaussians and the sensor pose of each frame are initialized using a LiDAR-inertial system with size-adaptive voxels. We then refine the Gaussians using visually derived photometric gradients to improve their quality and density. Our method is compatible with various types of LiDAR, including solid-state and mechanical LiDAR, and supports both repetitive and non-repetitive scanning modes. It bolsters structure construction through LiDAR and facilitates real-time generation of photorealistic renderings across diverse LIV datasets. It shows notable resilience and versatility in generating real-time photorealistic scenes, with potential applications in digital twins and virtual reality as well as in real-time SLAM and robotics. We release our software, hardware, and self-collected datasets to benefit the community.
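To make the idea of size-adaptive voxels concrete, the following is a minimal sketch, not the released implementation. It assumes that a voxel is accepted as a single surface Gaussian once its points look planar (the smallest eigenvalue of the point covariance falls below a threshold) and is otherwise split into eight children. The function name `init_surface_gaussians` and the parameters `min_size`, `planar_thresh`, and `min_points` are hypothetical and chosen only for illustration.

```python
import numpy as np


def init_surface_gaussians(points, origin, size,
                           min_size=0.25, planar_thresh=1e-3, min_points=5):
    """Recursively split a cubic voxel until its points look planar.

    Returns a list of (mean, covariance) pairs, one per accepted voxel.
    The planarity test and all thresholds are illustrative assumptions.
    """
    if len(points) < min_points:
        return []  # too few points to fit a stable Gaussian

    mean = points.mean(axis=0)
    cov = np.cov(points.T)              # 3x3 covariance of the voxel's points
    eigvals = np.linalg.eigvalsh(cov)   # eigenvalues in ascending order

    # Accept the voxel if the points are nearly planar, or if it cannot shrink further.
    if eigvals[0] < planar_thresh or size <= min_size:
        return [(mean, cov)]

    # Otherwise split into 8 octants and recurse on each child voxel.
    half = size / 2.0
    center = origin + half
    octant = (points >= center).astype(int) @ np.array([1, 2, 4])
    gaussians = []
    for k in range(8):
        offset = np.array([k & 1, (k >> 1) & 1, (k >> 2) & 1]) * half
        gaussians += init_surface_gaussians(
            points[octant == k], origin + offset, half,
            min_size, planar_thresh, min_points)
    return gaussians


# Usage: a flat synthetic patch (z = 0) inside a 1 m voxel.
rng = np.random.default_rng(0)
patch = np.column_stack([rng.random(200), rng.random(200), np.zeros(200)])
gaussians = init_surface_gaussians(patch, origin=np.zeros(3), size=1.0)
```

In this sketch, large flat regions are covered by a few large Gaussians while corners and clutter are subdivided more finely. The photometric refinement stage described in the abstract is not shown here.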