This paper presents a method for achieving high-speed running on a quadruped robot by accounting for the actuator torque-speed operating region in reinforcement learning. The physical properties and constraints of the actuator are included in the training process to reduce state transitions that are infeasible in the real world because of motor torque-speed limits. The gait reward is designed to distribute motor torque evenly across all legs, which balances power usage and mitigates the performance bottleneck caused by saturation of a single motor. Additionally, we designed a lightweight foot to enhance the robot's agility. We observed that applying the motor operating region as a constraint helps the policy network avoid infeasible regions during sampling. With the trained policy, KAIST Hound, a 45 kg quadruped robot, can run at up to 6.5 m/s, which is the fastest speed among electric motor-based quadruped robots.
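To make the idea of a torque-speed operating region concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simple linear back-EMF motor model in which the available driving torque falls linearly with joint speed and is additionally capped at a peak value. The function names (`torque_limits`, `clip_to_operating_region`, `region_violation`) and all numeric parameters (`tau_peak`, `tau_stall`, `omega_noload`) are hypothetical and chosen only for illustration; they are not the KAIST Hound actuator values.

```python
import numpy as np


def torque_limits(omega, tau_peak, tau_stall, omega_noload):
    """Return (lower, upper) torque bounds at joint speed omega [rad/s].

    Assumed model: a linear torque-speed line through (0, tau_stall) and
    (omega_noload, 0), capped at +/- tau_peak. Illustrative only.
    """
    omega = np.asarray(omega, dtype=float)
    # Driving torque in the direction of motion shrinks as speed grows.
    upper = np.clip(tau_stall * (1.0 - omega / omega_noload), -tau_peak, tau_peak)
    lower = np.clip(-tau_stall * (1.0 + omega / omega_noload), -tau_peak, tau_peak)
    return lower, upper


def clip_to_operating_region(tau_cmd, omega, tau_peak=100.0,
                             tau_stall=150.0, omega_noload=30.0):
    """Saturate a commanded torque to the assumed feasible region."""
    lower, upper = torque_limits(omega, tau_peak, tau_stall, omega_noload)
    return np.clip(tau_cmd, lower, upper)


def region_violation(tau_cmd, omega, tau_peak=100.0,
                     tau_stall=150.0, omega_noload=30.0):
    """Non-negative amount by which tau_cmd leaves the feasible region.

    A quantity like this could serve as a constraint cost during training.
    """
    lower, upper = torque_limits(omega, tau_peak, tau_stall, omega_noload)
    tau_cmd = np.asarray(tau_cmd, dtype=float)
    return np.maximum(tau_cmd - upper, 0.0) + np.maximum(lower - tau_cmd, 0.0)
```

Under these assumed parameters, a command of 80 N·m at 20 rad/s would be saturated to the envelope value at that speed, while the same command at standstill would pass through unchanged. In a simulator, applying such a saturation to the actuator model (or penalizing `region_violation`) is one way to keep sampled transitions within what the motor can physically deliver.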