Visual localization is a fundamental task for a wide range of robotics applications. Yet it remains a complex problem with no universal solution, and existing approaches are difficult to scale: most state-of-the-art solutions cannot provide accurate localization without a significant amount of storage space. We propose a hierarchical, low-memory approach to localization based on keypoints with different descriptor lengths. This is made possible by an unsupervised neural network that we developed, which predicts a feature pyramid with different descriptor lengths for images. This structure allows coarse-to-fine paradigms to be applied to localization based on a keypoint map, and allows the localization accuracy to be varied by changing the type of descriptors used in the pipeline. Compared with state-of-the-art methods, our approach achieves comparable localization accuracy while significantly reducing memory consumption (by up to 16 times).
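The coarse-to-fine idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the descriptor lengths (8 and 64), the function names `l2_normalize` and `coarse_to_fine_match`, the shortlist size `k`, and the synthetic data are all assumptions made for illustration. In particular, the short descriptors here are simply a slice of the long ones, standing in for the output of a coarser pyramid level. Each query keypoint is first compared with the map using the short descriptors, and only the resulting shortlist is re-ranked with the long descriptors.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so that dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def coarse_to_fine_match(q_short, q_long, m_short, m_long, k=5):
    """Return, for each query keypoint, the index of its best map keypoint.

    Coarse step: shortlist the k most similar map keypoints using the
    short descriptors. Fine step: re-rank that shortlist using the long
    descriptors. Requires k <= number of map keypoints.
    """
    # Coarse similarities, shape (n_query, n_map).
    coarse = q_short @ m_short.T
    # Indices of the k highest coarse scores per query (order within k is arbitrary).
    shortlist = np.argpartition(-coarse, k - 1, axis=1)[:, :k]
    # Long descriptors of the shortlisted map keypoints, shape (n_query, k, d_long).
    candidates = m_long[shortlist]
    # Fine similarities restricted to the shortlist, shape (n_query, k).
    fine = np.einsum("qd,qkd->qk", q_long, candidates)
    best = fine.argmax(axis=1)
    return shortlist[np.arange(shortlist.shape[0]), best]


# Synthetic stand-in for a keypoint map (hypothetical sizes).
rng = np.random.default_rng(0)
map_raw = rng.normal(size=(200, 64))
map_long = l2_normalize(map_raw)
map_short = l2_normalize(map_raw[:, :8])

# Queries are noisy copies of 20 map keypoints.
truth = rng.choice(200, size=20, replace=False)
query_raw = map_raw[truth] + 0.01 * rng.normal(size=(20, 64))
query_long = l2_normalize(query_raw)
query_short = l2_normalize(query_raw[:, :8])

matches = coarse_to_fine_match(query_short, query_long, map_short, map_long, k=5)
print("fraction of correct matches:", np.mean(matches == truth))
```

In this sketch the long descriptors are only read for `k` candidates per query, which is what makes a hierarchy of descriptor lengths attractive; choosing which descriptor level to store or load is then a trade-off between accuracy and memory.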