Efficient detection and description of geometric regions in images is a prerequisite for visual systems that perform localization and mapping. Such systems still rely on traditional hand-crafted methods to generate lightweight descriptors efficiently, because the more powerful neural network models come with high compute and specific hardware requirements. In this paper, we focus on the adaptations that detection and description neural networks require to run on computationally limited platforms such as robots, mobile devices, and augmented reality devices. To that end, we investigate and adapt network quantization techniques to accelerate inference. In addition, we revisit common practices in descriptor quantization and propose a binary descriptor normalization layer, which enables the generation of distinctive binary descriptors with a constant number of ones. ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size by at least an order of magnitude compared to full-precision counterparts. These improvements come at a minor performance degradation, as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization. Code and models are available at https://github.com/menelaoskanakis/ZippyPoint.
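To make the phrase "binary descriptors with a constant number of ones" concrete, the following is a minimal illustrative sketch, not the normalization layer proposed in the paper. It assumes a plain top-k rule (the k largest entries of each real-valued descriptor become 1), which is only one simple way to obtain a fixed bit count, and it pairs this with brute-force Hamming matching. The function names `binarize_top_k` and `hamming_match`, the descriptor length of 256, and the choice k = 128 are all hypothetical.

```python
import numpy as np


def binarize_top_k(desc: np.ndarray, k: int) -> np.ndarray:
    """Set the k largest entries of each row to 1 and all others to 0.

    Assumption: a hard top-k rule stands in for the paper's normalization
    layer; it only reproduces the "constant number of ones" property.
    """
    # argpartition places the indices of the k largest values in the last k slots.
    top_idx = np.argpartition(desc, -k, axis=1)[:, -k:]
    bits = np.zeros(desc.shape, dtype=np.uint8)
    np.put_along_axis(bits, top_idx, 1, axis=1)
    return bits


def hamming_match(query_bits: np.ndarray, ref_bits: np.ndarray):
    """Return, for each query, the index of and distance to the nearest
    reference descriptor under the Hamming distance (brute force)."""
    # Pack 8 bits per byte so the XOR touches 8x fewer elements.
    q = np.packbits(query_bits, axis=1)
    r = np.packbits(ref_bits, axis=1)
    xor = np.bitwise_xor(q[:, None, :], r[None, :, :])
    # Count differing bits for every (query, reference) pair.
    dist = np.unpackbits(xor, axis=2).sum(axis=2)
    return dist.argmin(axis=1), dist.min(axis=1)


# Hypothetical usage with random real-valued descriptors.
rng = np.random.default_rng(0)
real_desc = rng.standard_normal((5, 256))
bits = binarize_top_k(real_desc, k=128)
nearest, distance = hamming_match(bits, bits)
```

With this rule every row of `bits` sums to exactly k. A consequence of the fixed bit count is that the Hamming distance between two such descriptors equals 2(k - overlap), where overlap is the number of positions in which both are 1, so distances are directly comparable across descriptors.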