Recent work has shown impressive localization performance using only images of ground textures taken with a downward-facing monocular camera. This approach provides a reliable navigation method that is robust to feature-sparse environments and challenging lighting conditions. However, these localization methods require an existing map for comparison. Our work aims to remove the need for a prior map by introducing a full simultaneous localization and mapping (SLAM) system. Because no existing map is required, setup time is minimized and the system is more robust to changing environments. The system combines several techniques. Image keypoints are identified and projected onto the ground plane. These keypoints, visual bags of words, and several threshold parameters are then used to identify overlapping images and revisited areas. The system then uses robust M-estimators to estimate the transforms between robot poses with overlapping images and between poses in revisited areas. These optimized estimates make up the map used for navigation. We show, through experimental data, that this system performs reliably on many ground textures, but not all.
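As an illustration of the robust M-estimation step mentioned in the abstract, the following minimal sketch estimates a planar rigid transform between two sets of matched ground-plane keypoints. It is not the authors' implementation. The abstract does not name a loss function, so the Huber loss, the iteratively reweighted least squares (IRLS) scheme, the function names `huber_weight` and `estimate_rigid_2d`, and the parameters `delta` and `iters` are all assumptions made for this example.

```python
import math


def huber_weight(r, delta):
    """IRLS weight for the Huber loss (assumed loss; not specified in the abstract)."""
    return 1.0 if r <= delta else delta / r


def estimate_rigid_2d(src, dst, delta=0.05, iters=20):
    """Estimate (theta, tx, ty) with dst ~= R(theta) * src + t.

    src and dst are equal-length lists of matched (x, y) points on the
    ground plane. Each iteration solves a weighted least-squares rigid
    alignment in closed form, then recomputes Huber weights from the
    residuals. With iters=1 the result is the plain least-squares fit.
    """
    w = [1.0] * len(src)
    theta, tx, ty = 0.0, 0.0, 0.0
    for _ in range(iters):
        sw = sum(w)
        # Weighted centroids of both point sets.
        csx = sum(wi * p[0] for wi, p in zip(w, src)) / sw
        csy = sum(wi * p[1] for wi, p in zip(w, src)) / sw
        cdx = sum(wi * q[0] for wi, q in zip(w, dst)) / sw
        cdy = sum(wi * q[1] for wi, q in zip(w, dst)) / sw
        # Weighted dot and cross sums of the centered correspondences.
        s_dot = 0.0
        s_cross = 0.0
        for wi, (px, py), (qx, qy) in zip(w, src, dst):
            ax, ay = px - csx, py - csy
            bx, by = qx - cdx, qy - cdy
            s_dot += wi * (ax * bx + ay * by)
            s_cross += wi * (ax * by - ay * bx)
        # Closed-form optimal rotation angle and translation in 2D.
        theta = math.atan2(s_cross, s_dot)
        c, s = math.cos(theta), math.sin(theta)
        tx = cdx - (c * csx - s * csy)
        ty = cdy - (s * csx + c * csy)
        # Down-weight correspondences with large residuals.
        w = []
        for (px, py), (qx, qy) in zip(src, dst):
            rx = c * px - s * py + tx - qx
            ry = s * px + c * py + ty - qy
            w.append(huber_weight(math.hypot(rx, ry), delta))
    return theta, tx, ty
```

Calling `estimate_rigid_2d(src, dst)` on matched keypoints returns the rotation angle in radians and the translation in the same units as the points. Because the Huber weights bound the influence of each correspondence, a few mismatched keypoints shift the estimate far less than they would in an unweighted least-squares fit.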