Efficiently training quadruped robots to navigate densely cluttered environments remains a significant challenge. Existing methods either lack safety and agility in simple obstacle distributions or suffer from slow locomotion in complex environments, and they often require excessively long training phases. To this end, we propose SEA-Nav (Safe, Efficient, and Agile Navigation), a reinforcement learning framework for quadruped navigation. Within diverse and dense obstacle environments, a differentiable control barrier function (CBF)-based shield constrains the navigation policy to output safe velocity commands. An adaptive collision replay mechanism and hazardous exploration rewards are introduced to increase the probability of learning from critical experiences, guiding efficient exploration and exploitation. Finally, kinematic action constraints are incorporated to ensure safe velocity commands, facilitating successful physical deployment. To the best of our knowledge, this is the first approach that achieves highly challenging quadruped navigation in the real world with minute-level training time.
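To make the idea of a CBF-based shield on velocity commands concrete, the following is a minimal sketch under strong simplifying assumptions: a single circular obstacle, a robot modeled as a planar single integrator (position rate equals the commanded velocity), and the barrier function h(p) = ||p - c||^2 - r^2. The abstract does not specify SEA-Nav's actual formulation, so the function name `cbf_shield`, its parameters, and the closed-form projection below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def cbf_shield(v_nom, p, c, r, alpha=1.0):
    """Hypothetical CBF safety filter for one circular obstacle.

    Returns the velocity closest to v_nom (in Euclidean norm) that satisfies
    the CBF condition  grad_h(p) . v + alpha * h(p) >= 0,
    with h(p) = ||p - c||^2 - r^2 and single-integrator dynamics p_dot = v.
    Assumes the robot is not exactly at the obstacle center (p != c).
    """
    v_nom = np.asarray(v_nom, dtype=float)
    d = np.asarray(p, dtype=float) - np.asarray(c, dtype=float)
    h = d @ d - r**2              # barrier value: positive outside the obstacle
    a = 2.0 * d                   # gradient of h with respect to p
    slack = a @ v_nom + alpha * h
    if slack >= 0.0:
        return v_nom              # nominal command already satisfies the constraint
    # Closed-form solution of the one-constraint QP: project v_nom onto the
    # half-space {v : a . v + alpha * h >= 0}.
    return v_nom - (slack / (a @ a)) * a


# Robot at the origin heading straight at an obstacle centered at (1, 0).
v_safe = cbf_shield(v_nom=[1.0, 0.0], p=[0.0, 0.0], c=[1.0, 0.0], r=0.5)
print(v_safe)
```

Because the filtered command is a piecewise-linear function of the nominal command, gradients can pass through it almost everywhere, which is one simple way a shield of this kind can be made differentiable. A setting with many obstacles would generally need several constraints and a QP solver rather than this single-constraint projection.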