We present a novel autonomous robot navigation algorithm for outdoor environments that handles diverse terrain traversability conditions. Our approach, VLM-GroNav, integrates vision-language models (VLMs) with physical grounding to assess intrinsic terrain properties such as deformability and slipperiness. Proprioceptive sensing provides direct measurements of these physical properties and enhances the overall semantic understanding of the terrain. Our formulation uses in-context learning to ground the VLM's semantic understanding in proprioceptive data, allowing traversability estimates to be updated dynamically from the robot's real-time physical interactions with the environment. The updated traversability estimates inform both the local and global planners for real-time trajectory replanning. We validate our method on a legged robot (Ghost Vision 60) and a wheeled robot (Clearpath Husky) in diverse real-world outdoor environments with different deformable and slippery terrains. In practice, we observe significant improvements over state-of-the-art methods, with up to a 50% increase in navigation success rate.
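The core update described above — fusing a VLM's semantic prior with proprioceptive measurements to refine a traversability estimate — can be illustrated with a minimal sketch. All names, the slip/sinkage features, and the linear blending rule here are illustrative assumptions for exposition, not the paper's actual formulation (which uses in-context learning with the VLM):

```python
# Hypothetical sketch: blending a VLM's semantic traversability prior with
# proprioceptive measurements. Function names, features, and the linear
# blending rule are illustrative assumptions, not the authors' method.
from dataclasses import dataclass


@dataclass
class TerrainObservation:
    label: str        # VLM semantic label, e.g. "wet grass"
    vlm_prior: float  # initial traversability estimate from the VLM, in [0, 1]


def proprioceptive_score(slip_ratio: float, sinkage: float) -> float:
    """Map measured slip and sinkage (both normalized to [0, 1]) to a
    traversability score; higher slip or sinkage lowers the score."""
    return max(0.0, 1.0 - 0.5 * slip_ratio - 0.5 * sinkage)


def update_traversability(obs: TerrainObservation, slip_ratio: float,
                          sinkage: float, alpha: float = 0.6) -> float:
    """Blend direct physical measurements with the VLM's semantic prior,
    weighting measurements by alpha."""
    measured = proprioceptive_score(slip_ratio, sinkage)
    return alpha * measured + (1.0 - alpha) * obs.vlm_prior


# Example: the VLM rates mud as fairly traversable, but high measured slip
# and sinkage pull the estimate down after physical interaction.
obs = TerrainObservation(label="mud", vlm_prior=0.8)
score = update_traversability(obs, slip_ratio=0.6, sinkage=0.4)
```

A planner could then treat terrains whose updated score falls below a threshold as obstacles during replanning; in practice the blending would be learned or prompted rather than a fixed linear rule.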