In search of the simplest baseline capable of competing with deep reinforcement learning (RL) on locomotion tasks, we propose a biologically inspired, model-free, open-loop strategy. Drawing on prior knowledge and using simple oscillators to generate periodic joint motions, it achieves respectable performance in five different locomotion environments while using only a tiny fraction of the thousands of tunable parameters that RL algorithms typically require. Unlike RL methods, whose performance tends to degrade under sensor noise or sensor failure, our open-loop oscillators are remarkably robust because they do not rely on sensors. Furthermore, we demonstrate a successful transfer from simulation to reality on an elastic quadruped, without any randomization or reward engineering.
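To make the idea of an open-loop oscillator policy concrete, the following is a minimal sketch, not the authors' implementation. It assumes one plain sinusoid per joint with four tunable parameters (amplitude, frequency, phase, offset); the abstract does not specify the oscillator equations, so the class `Oscillator`, the functions `open_loop_policy` and `rollout`, and the values in `JOINTS` are all hypothetical and chosen only for illustration. The point it illustrates is that the commanded joint positions depend on time alone, never on sensor readings.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Oscillator:
    """One joint's periodic reference (assumed form: a plain sinusoid)."""
    amplitude: float     # peak deviation from the offset, in radians
    frequency_hz: float  # oscillation frequency, in Hz
    phase: float         # phase shift, in radians
    offset: float        # mean joint position, in radians

    def position(self, t: float) -> float:
        # Desired joint position at time t (seconds).
        angle = 2.0 * math.pi * self.frequency_hz * t + self.phase
        return self.offset + self.amplitude * math.sin(angle)


def open_loop_policy(oscillators: List[Oscillator], t: float) -> List[float]:
    # Open loop: the output is a function of time only, with no observation input.
    return [osc.position(t) for osc in oscillators]


def rollout(oscillators: List[Oscillator], dt: float = 0.01, steps: int = 100) -> List[List[float]]:
    # Desired joint positions at each control step of a fixed-rate loop.
    return [open_loop_policy(oscillators, k * dt) for k in range(steps)]


# Hypothetical two-joint gait: same frequency, joints in anti-phase.
JOINTS = [
    Oscillator(amplitude=0.5, frequency_hz=1.0, phase=0.0, offset=0.0),
    Oscillator(amplitude=0.5, frequency_hz=1.0, phase=math.pi, offset=0.1),
]

if __name__ == "__main__":
    for targets in rollout(JOINTS, dt=0.25, steps=4):
        print([round(q, 3) for q in targets])
```

In this sketch each joint contributes four parameters, so a robot with a handful of joints would need only a few dozen values to tune. On a real robot or in a simulator, the returned positions would typically be passed to a low-level position controller, which is also an assumption here rather than something the abstract states.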