Integrating Sample Inheritance into Bayesian Optimization for Evolutionary Robotics

In evolutionary robotics, robot morphologies are designed automatically using evolutionary algorithms. This creates a body-brain optimization problem, where both morphology and control must be optimized together. A common approach is to include controller optimization for each morphology, but starting from scratch for every new body may require a large controller learning budget. We address this by using Bayesian optimization for controller optimization, exploiting its sample efficiency and strong exploration capabilities, and using sample inheritance as a form of Lamarckian inheritance. Under a deliberately low controller learning budget for each morphology, we investigate two types of sample inheritance: (1) transferring all of the parent's samples to the offspring to be used as a prior without evaluating them, and (2) reevaluating the parent's best samples on the offspring. Both are compared to a baseline without inheritance. Our results show that reevaluation performs best, and that prior-based inheritance also outperforms no inheritance. Analysis reveals that although the learning budget is too low for a single morphology on its own, generational inheritance compensates for this by accumulating learned adaptations across generations. Furthermore, inheritance mainly benefits offspring morphologies that are similar to their parents. Finally, we demonstrate the critical role of the environment, with more challenging environments resulting in more stable walking gaits. Our findings highlight that inheritance mechanisms can boost performance in evolutionary robotics without needing large learning budgets, offering an efficient path toward designing more capable robots.

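The two inheritance types described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Sample` type, the function names, the parameter `k`, and the convention that higher fitness is better are all assumptions made for illustration. The sketch only shows how the offspring's initial dataset for Bayesian optimization could be built, and how many evaluations each variant would consume from the learning budget.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Params = Tuple[float, ...]  # hypothetical controller parameter vector


@dataclass(frozen=True)
class Sample:
    """One controller evaluation: parameters and the fitness observed."""
    params: Params
    fitness: float  # assumption: higher is better


def inherit_as_prior(parent_samples: Sequence[Sample]) -> Tuple[List[Sample], int]:
    """Type (1): copy all parent samples unchanged to seed the offspring's
    dataset. Fitness values still refer to the parent's body, and no
    evaluations are spent."""
    return list(parent_samples), 0


def inherit_by_reevaluation(
    parent_samples: Sequence[Sample],
    evaluate: Callable[[Params], float],
    k: int,
) -> Tuple[List[Sample], int]:
    """Type (2): take the parent's k best controllers and measure them again
    on the offspring's body. Each reevaluation is assumed to count against
    the offspring's learning budget."""
    best = sorted(parent_samples, key=lambda s: s.fitness, reverse=True)[:k]
    fresh = [Sample(s.params, evaluate(s.params)) for s in best]
    return fresh, len(fresh)


# Toy usage: a stand-in fitness function for the offspring morphology.
def toy_offspring_fitness(params: Params) -> float:
    return -abs(params[0] - 0.5)


parent = [Sample((0.1,), 1.0), Sample((0.5,), 3.0), Sample((0.9,), 2.0)]
prior_data, prior_cost = inherit_as_prior(parent)
reeval_data, reeval_cost = inherit_by_reevaluation(parent, toy_offspring_fitness, k=2)
```

In either case, the returned samples would be handed to the Bayesian optimizer as its starting observations, and the remaining budget would be spent on new controller evaluations for the offspring.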