Sample-efficient learning of manipulation skills poses a major challenge in robotics. While recent approaches demonstrate impressive advances in the types of tasks that can be addressed and the sensing modalities that can be incorporated, they still require large amounts of training data. This is a major problem for learning actions on real-world robots in particular, because both demonstrations and real-world robot interactions are costly. To address this challenge, we introduce BOpt-GMM, a hybrid approach that combines imitation learning with the robot's own experience collection. We first learn a skill model as a dynamical system encoded in a Gaussian Mixture Model from a few demonstrations. We then improve this model with Bayesian optimization, building on a small number of autonomous skill executions in a sparse reward setting. We demonstrate the sample efficiency of our approach on multiple complex manipulation skills in both simulation and real-world experiments. Furthermore, we make the code and pre-trained models publicly available at http://bopt-gmm.cs.uni-freiburg.de.
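To make the skill model concrete, the following is a minimal sketch of how a Gaussian Mixture Model can encode a dynamical system and how a rollout yields a sparse reward. It is not taken from the BOpt-GMM code: it assumes a 1-D state, and the mixture parameters in `COMPONENTS` are hand-set, hypothetical values rather than ones fitted to demonstrations. The names `predict_velocity` and `rollout` are likewise illustrative. The Bayesian optimization step is not shown; it would propose changes to the mixture parameters and score each candidate by such rollouts.

```python
import math

# Hypothetical hand-set GMM over the joint (position x, velocity v) in 1-D.
# Each component stores its weight, mean, and the covariance entries needed
# for conditioning on x. In practice these would be fitted to demonstrations.
COMPONENTS = [
    {"pi": 0.5, "mu_x": 0.0, "mu_v": 1.0, "s_xx": 0.10, "s_xv": -0.05},
    {"pi": 0.5, "mu_x": 1.0, "mu_v": 0.0, "s_xx": 0.10, "s_xv": -0.05},
]


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predict_velocity(x, components):
    """Gaussian mixture regression: the conditional mean E[v | x]."""
    weights = [c["pi"] * gaussian_pdf(x, c["mu_x"], c["s_xx"]) for c in components]
    total = sum(weights)
    if total == 0.0:  # x is so far from every component that all weights underflow
        return 0.0
    velocity = 0.0
    for w, c in zip(weights, components):
        # Conditional mean of v given x for one Gaussian component.
        cond_mean = c["mu_v"] + c["s_xv"] / c["s_xx"] * (x - c["mu_x"])
        velocity += (w / total) * cond_mean
    return velocity


def rollout(x0, components, goal=1.0, tol=0.05, dt=0.05, steps=400):
    """Integrate the learned dynamics and return (final state, sparse reward)."""
    x = x0
    for _ in range(steps):
        x += dt * predict_velocity(x, components)
    reward = 1.0 if abs(x - goal) <= tol else 0.0  # success/failure signal only
    return x, reward


if __name__ == "__main__":
    x_final, reward = rollout(0.0, COMPONENTS)
    print(f"final state: {x_final:.3f}, reward: {reward}")
```

The reward is binary on purpose: the optimizer only learns whether an execution succeeded, which matches the sparse reward setting described in the abstract.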