We propose Deep Patch Visual Odometry (DPVO), a new deep learning system for monocular Visual Odometry (VO). DPVO uses a novel recurrent network architecture designed for tracking image patches across time. Recent approaches to VO have significantly improved the state-of-the-art accuracy by using deep networks to predict dense flow between video frames. However, using dense flow incurs a large computational cost, making these previous methods impractical for many use cases. Despite this cost, it has been assumed that dense flow is important because it provides additional redundancy against incorrect matches. DPVO disproves this assumption, showing that it is possible to achieve the best accuracy and efficiency by exploiting the advantages of sparse patch-based matching over dense flow. DPVO introduces a novel recurrent update operator for patch-based correspondence, coupled with differentiable bundle adjustment. On standard benchmarks, DPVO outperforms all prior work, including the learning-based state-of-the-art VO system (DROID), while using a third of the memory and running 3x faster on average. Code is available at https://github.com/princeton-vl/DPVO