This paper addresses Visual Place Recognition (VPR), which is essential for the safe navigation of mobile robots. The solution we propose employs panoramic images and deep learning models, which are fine-tuned with triplet loss functions that integrate curriculum learning strategies. By progressively presenting more challenging examples during training, these loss functions enable the model to learn more discriminative and robust feature representations, overcoming the limitations of conventional contrastive loss functions. After training, VPR is tackled in two steps: coarse (room retrieval) and fine (position estimation). The results demonstrate that the curriculum-based triplet losses consistently outperform standard contrastive loss functions, particularly under challenging perceptual conditions. To thoroughly assess the robustness and generalization capabilities of the proposed method, it is evaluated in a variety of indoor and outdoor environments. The approach is tested against common challenges encountered under real operating conditions, including severe illumination changes, dynamic visual effects such as noise and occlusions, and scenarios with limited training data. The results show that the proposed framework performs competitively in all these situations, achieving high recognition accuracy and demonstrating its potential as a reliable solution for real-world robotic applications. The code used in the experiments is available at https://github.com/MarcosAlfaro/TripletNetworksIndoorLocalization.git.
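The core idea of a curriculum-based triplet loss can be illustrated with a minimal sketch. This is illustrative only: the function names, the difficulty heuristic (the distance gap between negative and positive), and the linear unlocking schedule are assumptions, not the paper's actual implementation or descriptor network.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Standard triplet margin loss: push the anchor-negative distance
    # to exceed the anchor-positive distance by at least `margin`.
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)

def curriculum_batch(triplets, epoch, total_epochs):
    """Simple curriculum (assumed schedule): rank triplets from easy to
    hard and expose a growing fraction of them as training progresses.

    Difficulty heuristic: a small (d_neg - d_pos) gap means the negative
    is nearly as close as the positive, i.e. a hard triplet.
    """
    ranked = sorted(
        triplets,
        key=lambda t: euclidean(t[0], t[2]) - euclidean(t[0], t[1]),
        reverse=True,  # largest gap (easiest) first
    )
    # Linearly unlock the curriculum: early epochs see only easy triplets.
    frac = (epoch + 1) / total_epochs
    cutoff = max(1, int(len(ranked) * frac))
    return ranked[:cutoff]
```

For example, an easy triplet whose negative is already far away yields zero loss, while early epochs of `curriculum_batch` return only such easy triplets, leaving the hard ones for later epochs.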