Local positional graphs and attentive local features for a data and runtime-efficient hierarchical place recognition pipeline

Large-scale applications of Visual Place Recognition (VPR) require computationally efficient approaches. Further, a well-balanced combination of data-based and training-free approaches can decrease the required amount of training data and effort and can reduce the influence of distribution shifts between the training and application phases. This paper proposes a runtime- and data-efficient hierarchical VPR pipeline that extends existing approaches and presents novel ideas. There are three main contributions. First, we propose Local Positional Graphs (LPG), a training-free and runtime-efficient approach to encode the spatial context information of local image features. LPG can be combined with existing local feature detectors and descriptors, and in our experiments it considerably improves the image-matching quality compared to existing techniques. Second, we present Attentive Local SPED (ATLAS), an extension of our previous local features approach with an attention module that improves the feature quality while maintaining high data efficiency. The influence of the proposed modifications is evaluated in an extensive ablation study. Third, we present a hierarchical pipeline that exploits hyperdimensional computing (HDC) to use the same local features both as holistic HDC descriptors for fast candidate selection and for candidate reranking. We combine all contributions in a runtime- and data-efficient VPR pipeline that shows benefits over the state-of-the-art method Patch-NetVLAD on a large collection of standard place recognition datasets, with 15$\%$ better VPR accuracy, 54$\times$ faster feature comparison, and 55$\times$ lower descriptor storage, making our method promising for real-world, high-performance, large-scale VPR in changing environments. Code will be made available with publication of this paper.
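The two-stage idea in the third contribution (one set of local features used first as a holistic descriptor for candidate selection, then directly for reranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the HDC encoding or the reranking step, so the sketch assumes a generic binding-and-bundling scheme with random bipolar position codes, and it substitutes a simple mutual-nearest-neighbour count for the LPG-based reranking. All function names, the grid size, and the value of `k` are hypothetical.

```python
import numpy as np

def position_codebook(grid, dim, seed):
    """Random bipolar code per grid cell along one image axis (assumed scheme)."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=(grid, dim))

def holistic_descriptor(desc, xy, cb_x, cb_y):
    """Bind each local descriptor with its position code, then bundle by summing."""
    grid = cb_x.shape[0]
    # Quantize normalized keypoint coordinates in [0, 1] to grid cells.
    ix = np.minimum((xy[:, 0] * grid).astype(int), grid - 1)
    iy = np.minimum((xy[:, 1] * grid).astype(int), grid - 1)
    bound = desc * cb_x[ix] * cb_y[iy]   # binding: elementwise product
    h = bound.sum(axis=0)                # bundling: elementwise sum
    return h / (np.linalg.norm(h) + 1e-12)

def mutual_matches(a, b):
    """Count mutual nearest neighbours between two local descriptor sets.

    Stand-in for the reranking score; the paper uses LPG here instead.
    """
    sim = a @ b.T
    nn_ab = sim.argmax(axis=1)
    nn_ba = sim.argmax(axis=0)
    return int(np.sum(nn_ba[nn_ab] == np.arange(len(a))))

def retrieve(query, database, cb_x, cb_y, k=3):
    """Stage 1: top-k by holistic cosine similarity. Stage 2: rerank by local matches."""
    q_h = holistic_descriptor(query[0], query[1], cb_x, cb_y)
    db_h = np.stack([holistic_descriptor(d, p, cb_x, cb_y) for d, p in database])
    candidates = np.argsort(-(db_h @ q_h))[:k]
    scores = [mutual_matches(query[0], database[i][0]) for i in candidates]
    order = np.argsort(-np.asarray(scores), kind="stable")
    return [int(candidates[j]) for j in order]

def random_image(rng, n=50, dim=256):
    """Synthetic image: unit-length local descriptors plus keypoint positions."""
    desc = rng.standard_normal((n, dim))
    desc /= np.linalg.norm(desc, axis=1, keepdims=True)
    return desc, rng.random((n, 2))

rng = np.random.default_rng(0)
cb_x = position_codebook(8, 256, seed=1)
cb_y = position_codebook(8, 256, seed=2)
database = [random_image(rng) for _ in range(5)]
# Query: a slightly perturbed copy of database image 2.
query = (database[2][0] + 0.01 * rng.standard_normal((50, 256)), database[2][1])
ranking = retrieve(query, database, cb_x, cb_y, k=3)
```

In this sketch only one fixed-length vector per database image is compared in the first stage, which is what makes candidate selection cheap; the more expensive local comparison runs on the `k` shortlisted images only.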

