Learning a locomotion controller for a musculoskeletal system is challenging because of over-actuation and a high-dimensional action space. Many reinforcement learning methods attempt to address this problem, but they often struggle to learn human-like gaits because engineering an effective reward function is complex. In this paper, we demonstrate that adversarial imitation learning can overcome this difficulty: we analyze the key problems and provide solutions drawn from both the existing literature and novel techniques. We validate our methodology by learning walking and running gaits on a simulated humanoid model with 16 degrees of freedom and 92 muscle-tendon units, achieving natural-looking gaits from only a few demonstrations.