This paper presents LiteVLoc, a hierarchical visual localization framework that uses a lightweight topo-metric map to represent the environment. The method consists of three sequential modules that estimate camera poses in a coarse-to-fine manner. Unlike mainstream approaches that rely on detailed 3D representations, LiteVLoc reduces storage overhead by leveraging learning-based feature matching and geometric solvers for metric pose estimation. A novel dataset for the map-free relocalization task is also introduced. Extensive experiments, including localization and navigation in both simulated and real-world scenarios, validate the system's performance and demonstrate its precision and efficiency for large-scale deployment. Code and data will be made publicly available.