Legged robots have become capable of performing highly dynamic maneuvers in the past few years. However, agile locomotion in highly constrained environments such as stepping stones remains a challenge. In this paper, we propose a combination of model-based control, search, and learning to design efficient control policies for agile locomotion on stepping stones. In our framework, we use nonlinear model predictive control (NMPC) to generate whole-body motions for a given contact plan. To efficiently search for an optimal contact plan, we propose to use Monte Carlo tree search (MCTS). While the combination of MCTS and NMPC can find a feasible plan for a given environment within a few seconds, it is not yet suitable for use as a reactive policy. Hence, we generate a dataset for an optimal goal-conditioned policy in a given scene and learn the policy through supervised learning. In particular, we leverage the ability of diffusion models to handle multi-modality in the dataset. We test our proposed framework on a scenario where our quadruped robot Solo12 successfully jumps to different goals in a highly constrained environment.