Real-time, high-accuracy optical flow estimation is a crucial component of many applications, including localization and mapping in robotics, and object tracking and activity recognition in computer vision. Although recent learning-based optical flow methods achieve high accuracy, they often incur heavy computational costs. In this paper, we propose NeuFlow, a highly efficient optical flow architecture that addresses both accuracy and computational cost. The architecture follows a global-to-local scheme. Given features of the input images extracted at different spatial resolutions, global matching is employed to estimate an initial optical flow at 1/16 resolution, capturing large displacements; this flow is then refined at 1/8 resolution with lightweight CNN layers for better accuracy. We evaluate our approach on a Jetson Orin Nano and an RTX 2080 to demonstrate efficiency improvements across different computing platforms. We achieve a 10x-80x speedup over several state-of-the-art methods while maintaining comparable accuracy. Our approach runs at around 30 FPS on edge computing platforms, a significant breakthrough for deploying complex computer vision tasks such as SLAM on small robots like drones. The full training and evaluation code is available at https://github.com/neufieldrobotics/NeuFlow.
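As an illustration of the global matching step in the global-to-local scheme, the following is a minimal NumPy sketch. It assumes one common formulation, a softmax over all-pairs feature correlation followed by an expected-coordinate readout; the abstract does not specify the matching operator, so this should not be read as NeuFlow's actual implementation (see the linked repository for that). The function name `global_matching_flow` and the `(H, W, C)` feature layout are hypothetical choices made for this example.

```python
import numpy as np


def global_matching_flow(feat1, feat2):
    """Estimate a coarse flow field by global (all-pairs) feature matching.

    Assumed formulation, not taken from the paper: softmax over the
    correlation between every pixel of feat1 and every pixel of feat2.

    feat1, feat2: float arrays of shape (H, W, C), e.g. 1/16-resolution features.
    Returns: flow of shape (H, W, 2) holding (dx, dy) in pixels at this resolution.
    """
    h, w, c = feat1.shape
    f1 = feat1.reshape(h * w, c)
    f2 = feat2.reshape(h * w, c)

    # All-pairs correlation, scaled by sqrt(C) as in dot-product attention.
    corr = f1 @ f2.T / np.sqrt(c)                      # (H*W, H*W)

    # Numerically stable softmax over all candidate positions in image 2.
    corr = corr - corr.max(axis=1, keepdims=True)
    prob = np.exp(corr)
    prob /= prob.sum(axis=1, keepdims=True)

    # Pixel coordinate grid in (x, y) order, flattened to (H*W, 2).
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    grid = np.stack([xs, ys], axis=-1).reshape(h * w, 2).astype(np.float64)

    # Expected matching position minus source position gives the flow.
    matched = prob @ grid
    return (matched - grid).reshape(h, w, 2)
```

Because every pixel is compared against every other pixel, the cost grows quadratically with the number of pixels. This is consistent with running the global step only at the coarsest (1/16) resolution and leaving the 1/8-resolution refinement to lightweight local layers.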