The main question this paper addresses is: what combination of robot controller and learning method should be used if the morphology of the learning robot is not known in advance? Our interest is rooted in the context of morphologically evolving modular robots, but the question is also relevant more generally, for system designers interested in widely applicable solutions. We perform an experimental comparison of three controller-and-learner combinations: one where the controllers are modelled on animal locomotion (Central Pattern Generators, CPG) and the learner is an evolutionary algorithm; a completely different one that uses Reinforcement Learning (RL) with a neural network controller architecture; and an 'in-between' combination where the controllers are neural networks and the learner is an evolutionary algorithm. We apply these three combinations to a test suite of modular robots and compare their efficacy, efficiency, and robustness. Surprisingly, the usual CPG-based and RL-based options are outperformed by the in-between combination, which is more robust and efficient than the other two setups.