Visual place recognition (VPR) is a challenging task due to the imbalance between the enormous computational cost and the high recognition performance it demands. Thanks to the efficient feature extraction of lightweight convolutional neural networks (CNNs) and the trainability of the vector of locally aggregated descriptors (VLAD) layer, we propose a lightweight, weakly supervised, end-to-end neural network consisting of a front-end perception model called GhostCNN and a learnable VLAD layer as the back-end. GhostCNN is built on Ghost modules, lightweight CNN-based architectures that generate redundant feature maps with cheap linear operations instead of the traditional convolution process, striking a good trade-off between computational resources and recognition accuracy. To further enhance the proposed lightweight model, we add dilated convolutions to the Ghost module to obtain features containing more spatial semantic information, which improves accuracy. Finally, extensive experiments on a commonly used public benchmark and our private dataset validate that the proposed network reduces the FLOPs and parameters of VGG16-NetVLAD by 99.04% and 80.16%, respectively, while achieving comparable accuracy.
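The computational saving of a Ghost module comes from replacing most of the output channels of a standard convolution with cheap linear (depthwise) operations. A minimal sketch of the multiply-accumulate count, following the standard Ghost-module formulation (the symbols `s` for the ratio of ghost maps and `d` for the cheap-operation kernel size are the usual ones; the concrete sizes below are illustrative, not taken from the paper):

```python
def conv_flops(c_in, c_out, k, h, w):
    # Multiply-accumulates of a standard k x k convolution
    # producing a c_out x h x w output from c_in input channels.
    return c_out * h * w * c_in * k * k

def ghost_flops(c_in, c_out, k, d, s, h, w):
    # A Ghost module first runs a primary convolution that produces
    # only c_out / s intrinsic feature maps, then derives s - 1
    # "ghost" maps from each intrinsic map with a cheap d x d
    # depthwise linear operation.
    m = c_out // s
    primary = conv_flops(c_in, m, k, h, w)
    cheap = (s - 1) * m * h * w * d * d
    return primary + cheap

# Illustrative example: a 3x3, 256 -> 256 channel convolution on a
# 28x28 feature map versus a Ghost module with s = 2 and 3x3 cheap ops.
plain = conv_flops(256, 256, 3, 28, 28)
ghost = ghost_flops(256, 256, 3, 3, 2, 28, 28)
print(f"speed-up ratio: {plain / ghost:.2f}")  # close to s = 2
```

The ratio approaches `s` because the cheap depthwise operations cost far less than the dense channel mixing they replace, which is the trade-off the abstract refers to.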
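The back-end VLAD layer aggregates the CNN's local descriptors into one fixed-length global descriptor by soft-assigning each descriptor to learnable cluster centers and summing the residuals. A minimal NumPy sketch of this NetVLAD-style pooling (the distance-based soft assignment and the `alpha` sharpness parameter are standard; in the trainable layer the assignment is produced by a 1x1 convolution instead):

```python
import numpy as np

def vlad_aggregate(features, centroids, alpha=10.0):
    """Soft-assignment VLAD pooling over local descriptors.

    features:  (N, D) local descriptors from the CNN backbone
    centroids: (K, D) cluster centers (learnable in the real layer)
    Returns a (K * D,) L2-normalized global descriptor.
    """
    # Squared distances from every descriptor to every centroid, (N, K)
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)

    # Numerically stable softmax over negative scaled distances
    logits = -alpha * d2
    logits -= logits.max(axis=1, keepdims=True)
    a = np.exp(logits)
    a /= a.sum(axis=1, keepdims=True)

    # Weighted sum of residuals to each centroid, (K, D)
    residuals = features[:, None, :] - centroids[None, :, :]
    v = (a[:, :, None] * residuals).sum(axis=0)

    # Intra-normalization per cluster, then global L2 normalization
    v /= np.linalg.norm(v, axis=1, keepdims=True) + 1e-12
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

rng = np.random.default_rng(0)
desc = vlad_aggregate(rng.normal(size=(100, 8)), rng.normal(size=(4, 8)))
print(desc.shape)  # (32,) -- K * D regardless of the number of inputs
```

Because the output length depends only on `K` and `D`, images with different numbers of local features map to directly comparable descriptors, which is what makes the layer suitable as a place-recognition back-end.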