With increasing computing power, data-driven co-design of a robot's morphology and controller has become a promising approach. However, most existing data-driven methods must train a controller for each candidate morphology to compute its fitness, which is time-consuming. In contrast, the dual-network framework uses data collected by individual networks under a specific morphology to train a population network, which serves as a surrogate function for morphology optimization. This surrogate replaces the traditional evaluation of a diverse set of candidates and thereby speeds up training. Despite its considerable results, training both networks online limits their performance. To address this issue, we propose a concurrent network framework that combines online and offline reinforcement learning (RL). By applying the behavior cloning term flexibly, we combine the two networks effectively. Multiple sets of comparative experiments in simulation show that the proposed method addresses the issues present in the dual-network framework and improves overall algorithmic performance. We further validate the algorithm on a real robot, demonstrating its feasibility in a practical application.
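As an illustration of how a behavior cloning term can interpolate between online and offline RL objectives, the following is a minimal sketch and not the authors' implementation. It assumes a TD3+BC-style actor loss in which a weight scales a squared-error penalty between the policy's actions and the actions stored in the dataset; the function name `bc_regularized_actor_loss`, the parameter `bc_weight`, and the toy numbers are all hypothetical.

```python
def bc_regularized_actor_loss(q_values, policy_actions, dataset_actions, bc_weight):
    """Assumed actor loss: -mean(Q) + bc_weight * mean squared action error.

    q_values:        critic estimates Q(s, pi(s)), one per sample
    policy_actions:  actions proposed by the current policy, one list per sample
    dataset_actions: actions stored in the replay/offline dataset
    bc_weight:       hypothetical knob; 0 gives a purely online objective,
                     larger values pull the policy toward the dataset actions
    """
    n = len(q_values)
    # RL term: maximize the critic's value, so minimize its negative mean.
    rl_term = -sum(q_values) / n
    # Behavior cloning term: mean over samples of the squared action distance.
    bc_term = sum(
        sum((p - d) ** 2 for p, d in zip(pa, da))
        for pa, da in zip(policy_actions, dataset_actions)
    ) / n
    return rl_term + bc_weight * bc_term


# Toy batch of two samples with two-dimensional actions (made-up values).
q = [1.0, 3.0]
pi_a = [[0.0, 0.0], [1.0, 1.0]]
data_a = [[0.0, 1.0], [1.0, 0.0]]

online_loss = bc_regularized_actor_loss(q, pi_a, data_a, bc_weight=0.0)
offline_loss = bc_regularized_actor_loss(q, pi_a, data_a, bc_weight=2.0)
```

With `bc_weight=0.0` only the critic term remains, which corresponds to ordinary online actor training; raising the weight makes deviation from the dataset actions more costly, which is the usual way an offline method constrains the policy. How the weight is scheduled between the two networks is not specified in the abstract and is left open here.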