This work evaluates CNN models and data augmentation techniques for the hierarchical localization of a mobile robot using omnidirectional images. An ablation study of different state-of-the-art CNN backbones, such as ConvNeXt, is presented, and a variety of data augmentation visual effects are proposed to address the visual localization of the robot. The proposed method is based on adapting and re-training a CNN with a dual purpose: (1) to perform a coarse localization step, in which the model predicts the room from which an image was captured, and (2) to address the fine localization step, which consists in retrieving the most similar image of the visual map among those contained in the previously predicted room, by means of a pairwise comparison between descriptors obtained from an intermediate layer of the CNN. Additionally, each data augmentation effect is employed separately for training the model, and its individual impact is assessed. The performance of the resulting CNNs is evaluated under real operating conditions, including changes in lighting. Our code is publicly available on the project website: https://github.com/juanjo-cabrera/IndoorLocalizationSingleCNN.git
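The coarse-to-fine scheme described above can be sketched as follows. This is a minimal illustration with hypothetical descriptor data and function names; in the actual method, the room probabilities come from the CNN's classification head and the descriptors from one of its intermediate layers:

```python
import numpy as np

def hierarchical_localization(query_desc, room_probs, visual_map):
    """Two-step localization sketch.

    query_desc: descriptor of the query image (1-D array).
    room_probs: per-room scores predicted by the classifier.
    visual_map: list of (room_id, descriptor) pairs for the map images.
    Returns (predicted_room, index_of_most_similar_map_image).
    """
    # Coarse step: predict the room as the class with the highest score.
    room = int(np.argmax(room_probs))

    # Fine step: restrict the search to map images of the predicted room
    # and retrieve the nearest one by cosine similarity of descriptors.
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    candidates = [(i, d) for i, (r, d) in enumerate(visual_map) if r == room]
    best_idx, _ = max(candidates, key=lambda c: cosine(query_desc, c[1]))
    return room, best_idx

# Tiny hypothetical map: two rooms, three images.
vmap = [(0, np.array([1.0, 0.0])),
        (1, np.array([0.0, 1.0])),
        (1, np.array([0.7, 0.7]))]
room, idx = hierarchical_localization(np.array([0.1, 0.9]), [0.2, 0.8], vmap)
```

Restricting the pairwise comparison to the predicted room is what makes the fine step cheaper than an exhaustive search over the whole visual map.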