Quadrupedal locomotion is a complex, open-ended problem vital to expanding the reach of autonomous vehicles. Traditional reinforcement learning approaches often fall short due to training instability and sample inefficiency. We propose a novel method, which we name Multi-Objective Learning (MOL), that uses multi-objective evolutionary algorithms as an automatic curriculum learning mechanism. Our approach improves the learning process by projecting velocity commands into an objective space and optimizing for both performance and diversity. Tested in the MuJoCo physics simulator, our method demonstrates greater stability and adaptability than baseline approaches. In difficult scenarios, it achieves 19\% and 44\% fewer errors than our best baseline algorithm under a uniform and a tailored evaluation, respectively. This work introduces a robust framework for training quadrupedal robots, with promise for advances in robotic locomotion and other open-ended robotic problems.