In real-time Visual SLAM systems, local mapping must operate under strict latency constraints, as delays degrade map quality and increase the risk of tracking failure. GPU parallelization offers a promising way to reduce latency. However, parallelizing local mapping is challenging due to synchronized shared-state updates and the overhead of transferring large map data structures to the GPU. This paper presents TurboMap, a GPU-parallelized and CPU-optimized local mapping backend that holistically addresses these challenges. We restructure Map Point Creation to enable parallel Keypoint Correspondence Search on the GPU, redesign and parallelize Map Point Fusion, optimize Redundant Keyframe Culling on the CPU, and integrate a fast GPU-based Local Bundle Adjustment solver. To minimize data transfer and synchronization costs, we introduce persistent GPU-resident keyframe storage. Experiments on the EuRoC and TUM-VI datasets show average local mapping speedups of 1.3x and 1.6x, respectively, while preserving accuracy.