Thermal cameras offer strong potential for robot perception under challenging illumination and weather conditions. However, thermal Simultaneous Localization and Mapping (SLAM) remains difficult due to unreliable feature extraction, unstable motion tracking, and inconsistent global pose and map construction, particularly in dynamic large-scale outdoor environments. To address these challenges, we propose LST-SLAM, a novel large-scale stereo thermal SLAM system that achieves robust performance in complex, dynamic scenes. Our approach combines self-supervised thermal feature learning, stereo dual-level motion tracking, and geometric pose optimization. We also introduce a semantic-geometric hybrid constraint that suppresses potentially dynamic features lacking strong inter-frame geometric consistency. Furthermore, we develop an online incremental bag-of-words model for loop closure detection, coupled with global pose optimization to mitigate accumulated drift. Extensive experiments on kilometer-scale dynamic thermal datasets show that LST-SLAM significantly outperforms recent representative SLAM systems, including AirSLAM and DROID-SLAM, in both robustness and accuracy.
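As an illustration of the semantic-geometric hybrid constraint mentioned in the abstract, the following is a minimal sketch and not the LST-SLAM implementation. It assumes that a segmentation step has already flagged each matched feature as potentially dynamic (`maybe_dynamic`), that a fundamental matrix `F` between the two frames is available, and that geometric consistency is measured as point-to-epipolar-line distance with a pixel threshold `max_px`. The function names, the threshold, and the choice of epipolar distance are all hypothetical; the abstract does not specify them.

```python
import numpy as np


def epipolar_distance(F, pts_prev, pts_curr):
    """Pixel distance from each current point to the epipolar line
    induced by its matched point in the previous frame (assumed metric)."""
    pts_prev = np.asarray(pts_prev, dtype=float)
    pts_curr = np.asarray(pts_curr, dtype=float)
    ones = np.ones((pts_prev.shape[0], 1))
    x1 = np.hstack([pts_prev, ones])  # homogeneous points, shape (N, 3)
    x2 = np.hstack([pts_curr, ones])
    lines = x1 @ F.T                  # row i is the line F @ x1_i in the current image
    num = np.abs(np.sum(lines * x2, axis=1))
    den = np.linalg.norm(lines[:, :2], axis=1)
    return num / np.maximum(den, 1e-12)


def hybrid_keep_mask(F, pts_prev, pts_curr, maybe_dynamic, max_px=1.0):
    """Keep a feature unless it is both semantically flagged as potentially
    dynamic and geometrically inconsistent (hypothetical decision rule)."""
    maybe_dynamic = np.asarray(maybe_dynamic, dtype=bool)
    dist = epipolar_distance(F, pts_prev, pts_curr)
    return ~maybe_dynamic | (dist <= max_px)


# Toy example: F for pure sideways translation, so epipolar lines are horizontal
# and the distance reduces to the vertical offset between matched points.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts_prev = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
pts_curr = np.array([[12.0, 20.0], [33.0, 45.0], [55.0, 68.0]])
maybe_dynamic = np.array([True, True, False])

keep = hybrid_keep_mask(F, pts_prev, pts_curr, maybe_dynamic)
```

Under this rule a flagged feature that still satisfies the epipolar constraint (for example, a point on a parked vehicle) is retained, while a flagged feature that violates it is suppressed. Unflagged features pass through untouched and are left to whatever outlier rejection the tracker applies later.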