Controlled execution of dynamic motions in quadrupedal robots, especially those with articulated soft bodies, presents a unique set of challenges that traditional methods struggle to address efficiently. In this study, we tackle these issues with a simple yet effective two-stage learning framework for generating dynamic motions in quadrupedal robots. First, a gradient-free evolution strategy is employed to discover control policies with a simple representation, eliminating the need for a predefined reference motion. Then, we refine these policies using deep reinforcement learning. Our approach enables the acquisition of complex motions such as pronking and back-flipping, effectively from scratch. Additionally, our method simplifies the traditionally labour-intensive task of reward shaping, boosting the efficiency of the learning process. Importantly, our framework proves particularly effective for articulated soft quadrupeds, whose inherent compliance and adaptability make them ideal for dynamic tasks but also introduce unique control challenges.
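As a rough illustration of the first stage, the sketch below shows one common form of gradient-free evolution strategy: antithetic Gaussian perturbations with rank-based weighting. The abstract does not specify which strategy, policy parameterisation, or fitness function is used, so the function name `evolve_policy`, its hyperparameters, and the toy fitness (which stands in for a simulated rollout return) are all assumptions made for illustration.

```python
import numpy as np


def evolve_policy(fitness, dim, iterations=200, population=32,
                  sigma=0.1, lr=0.03, seed=0):
    """Hypothetical ES loop: maximise `fitness` over a flat parameter vector."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)  # simple policy representation (assumed)
    for _ in range(iterations):
        # Antithetic sampling: each perturbation is paired with its mirror.
        half = rng.standard_normal((population // 2, dim))
        noise = np.concatenate([half, -half])
        returns = np.array([fitness(theta + sigma * eps) for eps in noise])
        # Rank-based weights in [-0.5, 0.5]; only the ordering of returns
        # matters, so the update is insensitive to the reward scale.
        ranks = returns.argsort().argsort()
        weights = ranks / (len(ranks) - 1) - 0.5
        # Move theta towards perturbations that ranked higher.
        theta = theta + lr / (len(noise) * sigma) * (noise.T @ weights)
    return theta


# Toy stand-in for a rollout return (assumption): higher is better,
# with the maximum at `target`.
target = np.array([0.5, -0.3, 0.8, 0.1])


def toy_fitness(params):
    return -float(np.sum((params - target) ** 2))


best = evolve_policy(toy_fitness, dim=target.size)
```

In a two-stage setup of the kind described above, a parameter vector such as `best` would then seed the second stage, where a deep reinforcement learning algorithm refines the motion. How that hand-over is done is not stated in the abstract and is left out of this sketch.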