Visual Place Recognition (VPR) is a critical task for global re-localization in visual perception systems. It requires accurately recognizing a previously visited location under variations in illumination, occlusion, appearance, and viewpoint. In robotic systems and augmented reality, the target deployment platforms are battery-powered edge devices. Therefore, whilst the accuracy of VPR methods is important, so too are memory consumption and latency. Recent works have focused on the recall@1 metric as the performance measure, with limited attention to resource utilization. This has resulted in methods that use deep learning models too large to deploy on low-powered edge devices. We hypothesize that these large models are highly over-parameterized and can be optimized to satisfy the constraints of a low-powered embedded system whilst maintaining high recall performance. Our work studies the impact of compact convolutional network architecture design, in combination with full-precision and mixed-precision post-training quantization, on VPR performance. Importantly, we measure performance not only via the recall@1 score but also via memory consumption and latency. We characterize the design implications for memory, latency, and recall scores, and provide a number of design recommendations for VPR systems under these resource limitations.
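For readers unfamiliar with the recall@1 metric mentioned above, the following is a minimal sketch of how it is commonly computed in retrieval-based VPR: a query counts as correct when its single nearest database descriptor is a ground-truth match. The function names, the use of cosine similarity, and the toy descriptors are assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recall_at_1(query_descs, db_descs, positives):
    """Fraction of queries whose top-1 retrieved database item is a true match.

    positives[i] is the set of database indices counted as correct for query i
    (hypothetical ground truth, e.g. images taken near the query location).
    """
    hits = 0
    for query, correct in zip(query_descs, positives):
        # Retrieve the single most similar database descriptor.
        best = max(
            range(len(db_descs)),
            key=lambda j: cosine_similarity(query, db_descs[j]),
        )
        if best in correct:
            hits += 1
    return hits / len(query_descs)


# Toy 2-D descriptors (illustrative values only).
db = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.0]]
positives = [{0}, {1}, {2}]

score = recall_at_1(queries, db, positives)
print(f"recall@1 = {score:.3f}")
```

In this toy case the third query retrieves database item 0 rather than its labelled match, so two of the three queries are counted as correct. Real descriptors would come from the network under evaluation, and memory and latency would be measured separately on the target device.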