We study the problem of transferring an expert policy from a source robot to multiple different target robots. We propose Meta-Evolve, a method that uses continuous robot evolution to transfer the policy efficiently to each target robot through a set of tree-structured evolutionary robot sequences. Because the robot evolution tree lets evolution paths be shared across targets, our approach can significantly outperform naive one-to-one policy transfer. We also present a heuristic for determining an optimized robot evolution tree. Experiments show that, measured by simulation cost, our method improves the efficiency of one-to-three transfer of manipulation policies by up to $3.2\times$ and of one-to-six transfer of agile locomotion policies by $2.4\times$ over the baseline of launching multiple independent one-to-one policy transfers.
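The benefit of sharing evolution paths can be illustrated with a toy cost model. The sketch below is not the Meta-Evolve algorithm: the node names, the edge costs, and the assumption that independent one-to-one transfers would follow the same paths as the tree are all hypothetical, chosen only to show why a shared prefix is paid once in a tree but once per target in independent transfers.

```python
from typing import Dict, List, Tuple

# Hypothetical evolution tree: child -> (parent, cost of evolving the policy along that edge).
# Names and costs are illustrative assumptions, not values from the paper.
TREE: Dict[str, Tuple[str, float]] = {
    "mid": ("source", 4.0),        # intermediate robot shared by two targets
    "target_a": ("mid", 1.0),
    "target_b": ("mid", 2.0),
    "target_c": ("source", 3.0),
}
TARGETS: List[str] = ["target_a", "target_b", "target_c"]


def path_cost(node: str, tree: Dict[str, Tuple[str, float]], root: str = "source") -> float:
    """Sum the edge costs on the path from the root down to `node`."""
    cost = 0.0
    while node != root:
        parent, edge_cost = tree[node]
        cost += edge_cost
        node = parent
    return cost


def independent_cost(targets: List[str], tree: Dict[str, Tuple[str, float]]) -> float:
    """Cost of one-to-one transfers: every target pays for its full path."""
    return sum(path_cost(t, tree) for t in targets)


def shared_cost(tree: Dict[str, Tuple[str, float]]) -> float:
    """Cost of tree-structured transfer: every edge is paid for exactly once."""
    return sum(edge_cost for _, edge_cost in tree.values())


if __name__ == "__main__":
    print("independent:", independent_cost(TARGETS, TREE))
    print("shared:", shared_cost(TREE))
```

In this toy tree the `source -> mid` edge is counted twice by the independent transfers and once by the shared tree, which is the only source of savings. The actual speedups reported above depend on the real robots and on the tree chosen by the heuristic.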