Learning diverse skills is one of the main challenges in robotics. To this end, imitation learning approaches have achieved impressive results. However, these methods require explicitly labeled datasets or assume consistent skill execution to enable learning and active control of individual behaviors, which limits their applicability. In this work, we propose a cooperative adversarial method that obtains a single versatile policy with a controllable skill set from an unlabeled dataset containing diverse state transition patterns, by maximizing the discriminability of these patterns. Moreover, we show that by utilizing unsupervised skill discovery in the generative adversarial imitation learning framework, novel and useful skills emerge alongside successful task fulfillment. Finally, the obtained versatile policies are tested on an agile quadruped robot called Solo 8, where they faithfully replicate the diverse skills encoded in the demonstrations.
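To make the idea of combining adversarial imitation with skill discriminability more concrete, the following is a minimal sketch and not the formulation used in this work. It assumes a GAIL-style imitation reward of the form -log(1 - D), a skill discriminator that outputs logits over a discrete set of skills, a uniform skill prior, and a hypothetical weighting factor `skill_weight`. All function names are illustrative.

```python
import math


def imitation_reward(d_score: float) -> float:
    """GAIL-style reward (assumed form: -log(1 - D)).

    d_score is the discriminator's probability, in (0, 1), that a state
    transition came from the demonstrations rather than the policy.
    """
    eps = 1e-8  # avoids log(0) when the discriminator is fully confident
    return -math.log(max(1.0 - d_score, eps))


def skill_reward(skill_logits: list[float], skill: int) -> float:
    """Discriminability bonus log q(z | s, s') - log p(z).

    Assumes a uniform prior p(z) over len(skill_logits) discrete skills.
    The bonus is positive when the skill discriminator identifies the
    commanded skill better than chance, and negative otherwise.
    """
    m = max(skill_logits)
    # Numerically stable log-sum-exp for the softmax normalizer.
    log_norm = m + math.log(sum(math.exp(l - m) for l in skill_logits))
    log_q = skill_logits[skill] - log_norm
    log_p = -math.log(len(skill_logits))
    return log_q - log_p


def combined_reward(
    d_score: float,
    skill_logits: list[float],
    skill: int,
    skill_weight: float = 0.5,  # hypothetical trade-off, not from the paper
) -> float:
    """Weighted sum of the imitation reward and the skill bonus."""
    return imitation_reward(d_score) + skill_weight * skill_reward(skill_logits, skill)
```

Under these assumptions, the policy is rewarded both for producing transitions that resemble the demonstrations and for making the commanded skill easy to recognize from the transition, which is one common way to encourage distinct, controllable behaviors.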