Natural environments such as forests and grasslands are challenging for robotic navigation because high grass, twigs, and bushes are often falsely perceived as rigid obstacles. In this work, we propose Wild Visual Navigation (WVN), an online self-supervised learning system for traversability estimation that uses only vision. The system continuously adapts from a short human demonstration in the field. It leverages high-dimensional features from self-supervised visual transformer models, together with an online supervision-generation scheme that runs in real time on the robot. We demonstrate the advantages of our approach with experiments and ablation studies in challenging environments: forests, parks, and grasslands. Our system bootstraps traversable-terrain segmentation in less than 5 min of in-field training time, enabling the robot to navigate complex outdoor terrain, including negotiating obstacles in high grass and following a 1.4 km footpath. While our experiments were executed with a quadruped robot, ANYmal, the approach presented can generalize to any ground robot.