Central Pattern Generators (CPGs) form the neural basis of the rhythmic behaviors observed in the locomotion of legged animals. Organized into networks, CPG dynamics allow complex locomotor behaviors to emerge. In this work, we draw on this inspiration to develop walking behaviors in multi-legged robots. We present novel DeepCPG policies that embed CPGs as a layer in a larger neural network and facilitate end-to-end learning of locomotion behaviors in a deep reinforcement learning (DRL) setup. We demonstrate the effectiveness of this approach on insectoid robots simulated in a physics engine. We show that, compared to traditional approaches, DeepCPG policies allow sample-efficient end-to-end learning of effective locomotion strategies even with high-dimensional sensor spaces (vision). We scale the DeepCPG policies using a modular robot configuration and multi-agent DRL. Our results suggest that gradually complexifying these policies in a modular fashion, with embedded priors, could achieve non-trivial sensor and motor integration on a robot platform. These results also indicate the efficacy of bootstrapping more complex intelligent systems from simpler ones based on biological principles. Finally, we present experimental results for a proof-of-concept insectoid robot system, for which DeepCPG policies were first learned in the simulation engine and then transferred to real-world robots without any additional fine-tuning.
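To make the idea of a CPG used as a network layer more concrete, the following is a minimal sketch and not the authors' DeepCPG implementation. The abstract does not state which oscillator model is used, so a Kuramoto-style phase-oscillator network is assumed here. The function name `cpg_step` and the parameters `freq`, `amp`, `coupling`, and `phase_bias` are hypothetical; in a policy of this kind, `freq` and `amp` would plausibly be produced by upstream network layers, while here they are fixed constants.

```python
import numpy as np


def cpg_step(phase, freq, amp, coupling, phase_bias, dt=0.01):
    """One Euler step of an assumed Kuramoto-style phase-oscillator network.

    phase:      (n,) current oscillator phases in radians
    freq:       (n,) intrinsic frequencies in Hz (hypothetical policy output)
    amp:        (n,) output amplitudes (hypothetical policy output)
    coupling:   (n, n) weights, coupling[i, j] = influence of oscillator j on i
    phase_bias: (n, n) desired offsets, target value of phase[j] - phase[i]
    """
    # diff[i, j] = phase[j] - phase[i] - phase_bias[i, j]
    diff = phase[None, :] - phase[:, None] - phase_bias
    dphase = 2.0 * np.pi * freq + np.sum(coupling * np.sin(diff), axis=1)
    new_phase = (phase + dt * dphase) % (2.0 * np.pi)
    # Rhythmic command per oscillator, e.g. a joint position target.
    command = amp * np.sin(new_phase)
    return new_phase, command


# Two mutually coupled oscillators that should settle into anti-phase,
# a simple stand-in for the alternation between two legs.
phase = np.array([0.0, 0.3])
freq = np.array([1.0, 1.0])
amp = np.array([1.0, 1.0])
coupling = np.array([[0.0, 2.0], [2.0, 0.0]])
phase_bias = np.array([[0.0, np.pi], [-np.pi, 0.0]])

for _ in range(2000):
    phase, command = cpg_step(phase, freq, amp, coupling, phase_bias)

phase_gap = (phase[1] - phase[0]) % (2.0 * np.pi)
```

In this sketch the coupling terms pull the phase difference toward the offset stored in `phase_bias`, so the two commands end up mirroring each other regardless of the initial phases. Every operation is differentiable, which is what would allow such a step to sit inside a larger network trained end to end, under the assumptions stated above.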