Natural environments such as forests and grasslands are challenging for robotic navigation because high grass, twigs, and bushes are falsely perceived as rigid obstacles. In this work, we propose Wild Visual Navigation (WVN), an online self-supervised learning system for traversability estimation that uses only vision. The system continuously adapts from a short human demonstration in the field. It leverages high-dimensional features from self-supervised visual transformer models, together with an online supervision-generation scheme that runs in real time on the robot. We demonstrate the advantages of our approach with experiments and ablation studies in challenging environments in forests, parks, and grasslands. Our system bootstraps traversable terrain segmentation in less than 5 min of in-field training time, enabling the robot to navigate complex outdoor terrain: it negotiates obstacles in high grass and follows a 1.4 km footpath. While our experiments were executed with a quadruped robot, ANYmal, the approach can generalize to any ground robot.