Prepare for Warp Speed: Sub-millisecond Visual Place Recognition Using Event Cameras

Visual Place Recognition (VPR) enables systems to identify previously visited locations within a map, a fundamental task for autonomous navigation. Prior works have developed VPR solutions using event cameras, which asynchronously measure per-pixel brightness changes with microsecond temporal resolution. However, these approaches rely on dense representations of the inherently sparse camera output and require tens to hundreds of milliseconds of event data to predict a place. Here, we break this paradigm with Flash, a lightweight VPR system that predicts places using sub-millisecond slices of event data. Our method is based on the observation that active pixel locations provide strong discriminative features for VPR. Flash encodes these active pixel locations using efficient binary frames and computes similarities via fast bitwise operations, which are then normalized based on the relative event activity in the query and reference frames. Flash improves Recall@1 for sub-millisecond VPR over existing baselines by 11.33x on the indoor QCR-Event-Dataset and 5.92x on the 8 km Brisbane-Event-VPR dataset. Moreover, our approach reduces the duration for which the robot must operate without awareness of its position, as evidenced by a localization latency metric we term Time to Correct Match (TCM). To the best of our knowledge, this is the first work to demonstrate sub-millisecond VPR using event cameras.
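
The pipeline described above (binary frames of active pixel locations, bitwise similarity, normalization by event activity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact normalization, so the cosine-style division by the geometric mean of the active-pixel counts is an assumption. The toy 8x4 sensor size and the function names `encode_binary_frame`, `similarity`, and `best_match` are also hypothetical.

```python
import math

WIDTH, HEIGHT = 8, 4  # toy sensor resolution (assumption, not from the paper)


def encode_binary_frame(events, width=WIDTH, height=HEIGHT):
    """Pack active pixel locations into one Python int used as a bitset.

    `events` is an iterable of (x, y) pixel coordinates from a short slice
    of event data. Polarity, timestamps, and per-pixel counts are discarded:
    a pixel is either active (bit set) or not.
    """
    frame = 0
    for x, y in events:
        if 0 <= x < width and 0 <= y < height:
            frame |= 1 << (y * width + x)  # repeated events set the same bit
    return frame


def popcount(frame):
    """Number of active pixels (set bits) in a binary frame."""
    return bin(frame).count("1")


def similarity(query, reference):
    """Bitwise overlap, normalized by the event activity of both frames.

    The normalization here is an assumed stand-in for the one in the paper:
    overlap / sqrt(active_query * active_reference).
    """
    q_active, r_active = popcount(query), popcount(reference)
    if q_active == 0 or r_active == 0:
        return 0.0  # an empty frame carries no place information
    overlap = popcount(query & reference)  # pixels active in both frames
    return overlap / math.sqrt(q_active * r_active)


def best_match(query, references):
    """Return the index of the most similar reference frame and all scores."""
    scores = [similarity(query, ref) for ref in references]
    return max(range(len(scores)), key=scores.__getitem__), scores


# Two reference places and one query slice (all coordinates are made up).
references = [
    encode_binary_frame([(0, 0), (1, 0), (2, 1)]),
    encode_binary_frame([(5, 2), (6, 2), (7, 3), (4, 1)]),
]
query = encode_binary_frame([(5, 2), (6, 2), (0, 0)])
index, scores = best_match(query, references)
print(index, scores)
```

In this toy example the query shares two active pixels with the second reference and one with the first, so the second reference is selected. Without the normalization step, a reference frame with many active pixels would tend to score highly against any query, which is presumably why the similarity is scaled by relative event activity.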

