Executing dynamic motions in a controlled manner on quadrupedal robots, especially those with articulated soft bodies, poses challenges that traditional methods struggle to address efficiently. In this study, we tackle these challenges with a simple yet effective two-stage learning framework for generating dynamic motions in quadrupedal robots. First, a gradient-free evolution strategy discovers control policies with a simple representation, eliminating the need for a predefined reference motion. Then, we refine these policies using deep reinforcement learning. Our approach enables the acquisition of complex motions such as pronking and back-flipping, effectively from scratch. Additionally, our method simplifies the traditionally labour-intensive task of reward shaping, improving the efficiency of the learning process. Importantly, our framework proves particularly effective for articulated soft quadrupeds, whose inherent compliance and adaptability make them well suited to dynamic tasks but also introduce control challenges of their own.
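The first stage described above, a gradient-free search over a small set of policy parameters, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which evolution strategy is used, so a basic elite-averaging scheme stands in for it; `toy_return` is a hypothetical substitute for a simulated episode return; and `evolve`, its arguments, and the three-parameter policy are invented names and sizes. The second stage (refinement with deep reinforcement learning) is not shown; in a setup like this it would presumably start from the parameters the search returns.

```python
import random


def toy_return(params):
    """Hypothetical stand-in for an episode return (higher is better)."""
    target = [0.5, -0.3, 0.8]  # arbitrary optimum, for illustration only
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve(score_fn, dim, generations=200, population=20, elite=5,
           sigma=0.1, seed=0):
    """Gradient-free search: perturb a mean vector, keep the best, re-centre."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    for _ in range(generations):
        # Sample candidate parameter vectors around the current mean.
        candidates = [
            [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            for _ in range(population)
        ]
        # Rank by score only; no gradient of score_fn is ever computed.
        candidates.sort(key=score_fn, reverse=True)
        best = candidates[:elite]
        # Move the mean to the average of the elite candidates.
        mean = [sum(c[i] for c in best) / elite for i in range(dim)]
    return mean


stage_one_params = evolve(toy_return, dim=3)
```

Because the search only needs a scalar score per candidate, no reference motion or differentiable objective is required, which is the property the abstract attributes to the first stage.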