Data Efficient Visual Place Recognition Using Extremely JPEG-Compressed Images

Visual Place Recognition (VPR) is the ability of a robotic platform to correctly interpret visual stimuli from its on-board cameras in order to determine whether it is currently located in a previously visited place, despite changes in viewpoint, illumination and appearance. JPEG is a widely used image compression standard that can significantly reduce the size of an image at the cost of image clarity. In applications where several robotic platforms are deployed simultaneously, the visual data gathered must be transmitted remotely between the robots. JPEG compression can therefore be employed to drastically reduce the amount of data transmitted over a communication channel, since performing VPR with limited bandwidth can prove challenging. However, the effects of JPEG compression on the performance of current VPR techniques have not previously been studied. For this reason, this paper presents an in-depth study of JPEG compression in VPR-related scenarios. We evaluate a selection of well-established VPR techniques on well-established benchmark datasets with various amounts of compression applied. We show that introducing compression drastically reduces VPR performance, especially at the higher end of the compression range. Moreover, this paper demonstrates how fine-tuning a CNN on JPEG-compressed data can be used as an optimisation method, so that the network performs more consistently under the image transformations found in extremely JPEG-compressed images.
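To make the recognition step concrete, the following is a minimal sketch of how a query image descriptor might be matched against a map of previously visited places. It is not the method of any technique evaluated in the paper: the cosine-similarity matching, the `match_place` function, the `threshold` value and the three-element toy descriptors are all assumptions made for illustration. In practice, descriptors would come from a VPR technique applied to the (possibly JPEG-compressed) images.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero descriptor matches nothing
    return dot / (norm_a * norm_b)


def match_place(query, reference_map, threshold=0.8):
    """Return (place_id, score) for the best match, or (None, score)
    if no reference place is similar enough to count as a revisit.

    `threshold` is a hypothetical value chosen for this sketch.
    """
    best_id, best_score = None, -1.0
    for place_id, descriptor in reference_map.items():
        score = cosine_similarity(query, descriptor)
        if score > best_score:
            best_id, best_score = place_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


# Toy reference map: place name -> descriptor (hypothetical values).
reference_map = {
    "corridor": [0.9, 0.1, 0.0],
    "lobby": [0.1, 0.8, 0.3],
    "car_park": [0.0, 0.2, 0.9],
}

# A query close to the "corridor" descriptor is recognised as a revisit.
revisit_id, revisit_score = match_place([0.85, 0.15, 0.05], reference_map)

# A query that resembles no stored place falls below the threshold.
unknown_id, unknown_score = match_place([0.6, 0.0, 0.6], reference_map)
```

Under this framing, heavy compression matters because it perturbs the query descriptor: the more the descriptor drifts, the more likely the best match is a wrong place or falls below the acceptance threshold.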

