We present MotionHint, a novel self-supervised algorithm for monocular visual odometry (VO) that takes motion constraints into account. A key aspect of our approach is the use of an appropriate motion model that helps existing self-supervised monocular VO (SSM-VO) algorithms overcome issues related to local minima in their self-supervised loss functions. The motion model is expressed by a neural network named PPnet, which is trained to coarsely predict the next camera pose and the uncertainty of that prediction. Our self-supervised approach combines the original loss with a motion loss, defined as the weighted difference between the predicted pose and the generated ego-motion. Taking two existing SSM-VO systems as our baselines, we evaluate MotionHint on the standard KITTI benchmark. Experimental results show that MotionHint can be readily applied to existing open-source state-of-the-art SSM-VO systems, greatly improving their performance by reducing the resulting absolute trajectory error (ATE) by up to 28.73%.
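The combined objective described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the predicted uncertainty acts as a per-dimension inverse-variance weight on the pose difference, and the trade-off weight `lam` is a hypothetical hyperparameter introduced here for illustration.

```python
import numpy as np

def motion_loss(pred_pose, ego_motion, sigma):
    # Weighted difference between PPnet's coarse pose prediction and the
    # ego-motion produced by the SSM-VO network. Treating sigma as a
    # per-dimension standard deviation (inverse-variance weighting) is an
    # assumption made for this sketch.
    diff = pred_pose - ego_motion
    return float(np.sum((diff / sigma) ** 2))

def total_loss(original_loss, pred_pose, ego_motion, sigma, lam=1.0):
    # Combined self-supervised objective: the baseline's original loss
    # plus the motion loss, balanced by a hypothetical weight lam.
    return original_loss + lam * motion_loss(pred_pose, ego_motion, sigma)
```

When the predicted pose and the generated ego-motion agree, the motion loss vanishes and the objective reduces to the baseline's original loss; large uncertainty in a dimension down-weights disagreements there.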