In this paper, we propose a novel framework for synthesizing a single multimodal control policy for bipedal locomotion that generates diverse behaviors (or modes) together with emergent transition maneuvers between them. We first learn efficient latent encodings for each behavior by training an autoencoder on a dataset of rough reference motions. These latent encodings then serve as commands for training the multimodal policy, with adaptive sampling of modes and transitions to ensure consistent performance across behaviors. We validate the policy in simulation on distinct locomotion modes such as walking, leaping, jumping onto a block, and standing idle, as well as on all possible inter-mode transitions. Finally, we integrate a task-based planner that rapidly generates open-loop mode plans for the trained multimodal policy to solve high-level tasks such as reaching a goal position on challenging terrain. Complex parkour-like motions that smoothly combine the discrete locomotion modes were generated in 3 minutes to traverse tracks with a gap of width 0.45 m, a plateau of height 0.2 m, and a block of height 0.4 m, all of which are significant relative to the dimensions of our mini-biped platform.
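To make the idea of adaptive sampling over modes and transitions concrete, the following is a minimal sketch and not the method used in this work. The abstract does not specify the sampling rule, so the rule here is an assumption: each (source mode, target mode) pair keeps a running performance score in [0, 1], and pairs with lower scores are drawn more often. The mode identifiers, the functions `initial_scores`, `transition_weights`, `sample_transition`, and `update_score`, and the parameters `floor` and `rate` are all hypothetical.

```python
import random

# Hypothetical mode identifiers, loosely named after the behaviors in the abstract.
MODES = ["walk", "leap", "jump_on_block", "stand"]


def initial_scores(modes=MODES, start=0.5):
    """One running score per (source, target) pair; source == target means a steady mode."""
    return {(src, dst): start for src in modes for dst in modes}


def transition_weights(scores, floor=0.05):
    """Turn scores in [0, 1] into sampling probabilities (assumed rule).

    A pair's unnormalized weight is (1 - score), clipped below by `floor`
    so that well-learned pairs are still revisited occasionally.
    """
    raw = {pair: max(floor, 1.0 - score) for pair, score in scores.items()}
    total = sum(raw.values())
    return {pair: weight / total for pair, weight in raw.items()}


def sample_transition(scores, rng):
    """Draw one (source, target) pair, favoring pairs that currently perform poorly."""
    probs = transition_weights(scores)
    pairs = list(probs)
    return rng.choices(pairs, weights=[probs[p] for p in pairs], k=1)[0]


def update_score(scores, pair, episode_score, rate=0.1):
    """Exponential moving average of the performance observed for a pair."""
    scores[pair] = (1.0 - rate) * scores[pair] + rate * episode_score


if __name__ == "__main__":
    rng = random.Random(0)
    scores = initial_scores()
    for _ in range(5):
        pair = sample_transition(scores, rng)
        # Placeholder for a real rollout: a random number stands in for episode performance.
        update_score(scores, pair, episode_score=rng.random())
        print(pair, round(scores[pair], 3))
```

In a real training loop, the sampled pair would select which latent command sequence the policy receives during an episode, and the episode score would come from the tracking or task reward rather than a random placeholder.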