Monocular omnidirectional visual odometry (OVO) systems leverage 360-degree cameras to overcome the field-of-view limitations of perspective VO systems. However, existing methods, which rely on handcrafted features or photometric objectives, often lack robustness in challenging scenarios such as aggressive motion and varying illumination. To address this, we present 360DVO, the first deep learning-based OVO framework. Our approach introduces a distortion-aware spherical feature extractor (DAS-Feat) that adaptively learns distortion-resistant features from 360-degree images. The resulting sparse feature patches then provide constraints for pose estimation within a novel omnidirectional differentiable bundle adjustment (ODBA) module. To facilitate evaluation in realistic settings, we also contribute a new real-world OVO benchmark. Extensive experiments on this benchmark and on public synthetic datasets (TartanAir V2 and 360VO) demonstrate that 360DVO surpasses state-of-the-art baselines, including 360VO and OpenVSLAM, improving robustness by 50% and accuracy by 37.5%. Homepage: https://chris1004336379.github.io/360DVO-homepage
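The abstract does not spell out the ODBA formulation, but its core ingredient can be illustrated: because a 360-degree camera observes bearings over the full sphere, the reprojection residual is defined on unit bearing vectors rather than through a pinhole projection, and it remains differentiable so it can sit inside a learning pipeline. The sketch below is a minimal, assumption-laden illustration in PyTorch, not the authors' implementation; the names (`so3_exp`, `bearing_residual`) and the first-order Adam loop are hypothetical stand-ins for the paper's bundle adjustment solver.

```python
# Minimal sketch (NOT the 360DVO ODBA module): a differentiable
# omnidirectional reprojection residual on the unit sphere.
import torch

def skew(v: torch.Tensor) -> torch.Tensor:
    """Skew-symmetric matrix of a 3-vector (built to keep autograd intact)."""
    zero = v.new_zeros(())
    return torch.stack([
        torch.stack([zero, -v[2],  v[1]]),
        torch.stack([ v[2], zero, -v[0]]),
        torch.stack([-v[1],  v[0], zero]),
    ])

def so3_exp(w: torch.Tensor) -> torch.Tensor:
    """Rodrigues' formula: axis-angle (3,) -> rotation matrix (3, 3)."""
    theta = w.norm()
    if theta < 1e-8:                      # first-order approximation near zero
        return torch.eye(3) + skew(w)
    K = skew(w / theta)
    return (torch.eye(3) + torch.sin(theta) * K
            + (1.0 - torch.cos(theta)) * (K @ K))

def bearing_residual(w, t, points, bearings):
    """Angular reprojection error on the unit sphere.

    w, t     : (3,) camera pose as axis-angle rotation and translation
    points   : (N, 3) 3D points in the world frame
    bearings : (N, 3) observed unit bearing vectors in the camera frame
    returns  : (N,) residuals, 1 - cos(angle); valid over the full 360 degrees
    """
    p_cam = points @ so3_exp(w).T + t                 # world -> camera frame
    pred = p_cam / p_cam.norm(dim=1, keepdim=True)    # predicted unit bearings
    return 1.0 - (pred * bearings).sum(dim=1)         # zero when aligned

# Toy usage: recover a small ground-truth pose from noiseless bearings.
torch.manual_seed(0)
points = torch.randn(50, 3) + torch.tensor([0.0, 0.0, 4.0])
w_true = torch.tensor([0.05, -0.02, 0.03])
t_true = torch.tensor([0.10, 0.00, -0.05])
with torch.no_grad():
    p = points @ so3_exp(w_true).T + t_true
    bearings = p / p.norm(dim=1, keepdim=True)

w = torch.zeros(3, requires_grad=True)
t = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([w, t], lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = bearing_residual(w, t, points, bearings).pow(2).mean()
    loss.backward()
    opt.step()
print(w.detach(), t.detach())  # approaches (w_true, t_true)
```

In a real differentiable bundle adjustment, the Adam loop above would typically be replaced by a second-order solver (Gauss-Newton or Levenberg-Marquardt) unrolled inside the network, and depths of the feature patches would be optimized jointly with the poses.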