Designing a humanoid locomotion controller is challenging and classically split into sub-problems. Footstep planning is one of them: defining the sequence of footsteps to take. Even in simple environments, finding a minimal sequence, or even a feasible one, is a complex optimization problem. In the literature, this problem is usually addressed with search-based algorithms (e.g., variants of A*). However, such approaches are either computationally expensive or rely on hand-crafted tuning of several parameters. In this work, we first propose an efficient footstep planning method for navigating local environments with obstacles, based on state-of-the-art Deep Reinforcement Learning (DRL) techniques, with very low computational requirements for online inference. Our approach is heuristic-free and relies on a continuous set of actions to generate feasible footsteps, whereas other methods require selecting a relevant discrete set of actions. Second, we propose a forecasting method that quickly estimates the number of footsteps required to reach different candidate local targets, relying on computations inherent to the actor-critic DRL architecture. We demonstrate the validity of our approach with simulation results and through deployment on a kid-size humanoid robot during the RoboCup 2023 competition.
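One way the critic of an actor-critic agent can yield a footstep forecast "for free" can be sketched as follows. Assuming (hypothetically; the reward design is not specified in this abstract) a constant reward of -1 per footstep and a discount factor gamma, the trained critic's value V(s) approximates the discounted cost of the remaining footsteps, so inverting the geometric series converts V(s) into an estimated footstep count without any rollout:

```python
import math

GAMMA = 0.99  # discount factor (assumed value, for illustration only)

def value_for_steps(n: int, gamma: float = GAMMA) -> float:
    """Discounted return of n remaining footsteps at -1 reward each:
    V = -(1 - gamma^n) / (1 - gamma)."""
    return -(1.0 - gamma**n) / (1.0 - gamma)

def forecast_steps(v: float, gamma: float = GAMMA) -> float:
    """Invert the geometric series: estimate remaining footsteps from a
    critic value v, i.e. n = log(1 + v*(1 - gamma)) / log(gamma)."""
    return math.log(1.0 + v * (1.0 - gamma)) / math.log(gamma)

# Round-trip check: a target 25 footsteps away
v = value_for_steps(25)
n_est = forecast_steps(v)  # recovers 25.0 (up to floating-point error)
```

In such a setup, evaluating the critic once per candidate target suffices to rank the candidates by estimated footstep count, which is what makes this kind of forecast cheap at inference time.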