We present VAPOR, a novel method for autonomous legged robot navigation in unstructured, densely vegetated outdoor environments using offline Reinforcement Learning (RL). Our method trains a novel RL policy from unlabeled data collected in real outdoor vegetation. The policy takes height- and intensity-based cost maps derived from 3D LiDAR point clouds, a goal cost map, and processed proprioception data as state inputs, and learns physical and geometric properties of the surrounding vegetation, such as height, density, and solidity/stiffness, for navigation. Instead of executing the policy's end-to-end actions, we use the fully trained policy's Q network to evaluate dynamically feasible robot actions generated by a novel adaptive planner that can navigate through dense, narrow passages and prevent entrapment in vegetation such as tall grass and bushes. We demonstrate our method's capabilities on a legged robot in complex outdoor vegetation. We observe an improvement in success rates, a decrease in average power consumption, and a decrease in normalized trajectory length compared to both existing end-to-end offline RL and outdoor navigation methods.
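The idea of scoring planner-generated actions with a learned Q network, rather than executing the policy's own output, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the velocity limits, the helper names `candidate_actions` and `select_action`, the dictionary-based state, and the hand-written `toy_q` critic, which merely stands in for a trained Q network operating on cost maps and proprioception.

```python
import numpy as np

def candidate_actions(v_now, w_now, v_max=1.0, w_max=1.0, dv=0.2, dw=0.4, n=5):
    """Sample (linear, angular) velocity pairs reachable within one control step.

    All limits and step sizes are hypothetical placeholders.
    """
    vs = np.linspace(max(0.0, v_now - dv), min(v_max, v_now + dv), n)
    ws = np.linspace(max(-w_max, w_now - dw), min(w_max, w_now + dw), n)
    return np.array([(v, w) for v in vs for w in ws])

def select_action(q_fn, state, actions):
    """Score every candidate with the critic and return the highest-valued one."""
    q_values = np.array([q_fn(state, a) for a in actions])
    best = int(np.argmax(q_values))
    return actions[best], q_values[best]

def toy_q(state, action):
    """Stand-in critic (not a trained network): rewards forward speed and
    penalizes angular velocity that deviates from the goal heading."""
    v, w = action
    return v - abs(w - state["goal_heading"])

actions = candidate_actions(v_now=0.4, w_now=0.0)
best_action, best_q = select_action(toy_q, {"goal_heading": 0.2}, actions)
print(best_action, best_q)
```

In this arrangement the planner guarantees that every candidate is dynamically feasible, while the critic only ranks candidates, so a poorly estimated Q value can change the ranking but cannot produce an action outside the feasible set.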