We present DARLEI, a framework that combines evolutionary algorithms with parallelized reinforcement learning to efficiently train and evolve populations of UNIMAL agents. Our approach uses Proximal Policy Optimization (PPO) for individual agent learning and pairs it with a tournament selection-based generational learning mechanism to foster morphological evolution. By building on Nvidia's Isaac Gym, DARLEI leverages GPU-accelerated simulation to achieve a speedup of over 20x on a single workstation, compared with previous work that required large distributed CPU clusters. We systematically characterize DARLEI's performance under various conditions, revealing factors that affect the diversity of evolved morphologies. For example, by enabling inter-agent collisions within the simulator, we can simulate some multi-agent interactions between agents of the same morphology and observe how these interactions influence individual agent capabilities and long-term evolutionary adaptation. While current results demonstrate limited diversity across generations, we hope to extend DARLEI in future work to include interactions between diverse morphologies in richer environments, and to create a platform for coevolving populations and investigating the emergent behaviours that arise in them. Our source code is publicly available at https://saeejithnair.github.io/darlei.
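To make the pairing of per-agent learning with tournament selection concrete, the following is a minimal sketch of one possible generational loop, not DARLEI's actual implementation. The `Agent` class, the flat list-of-floats genome, the Gaussian `mutate` operator, and the toy `evaluate` function are all hypothetical stand-ins: in the framework described above, evaluation would correspond to training a UNIMAL agent with PPO in GPU-accelerated simulation, and mutation would act on the morphology itself.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    # Hypothetical genome: a flat list of floats standing in for a morphology.
    morphology: list
    fitness: float = 0.0


def evaluate(agent):
    # Placeholder for "train with PPO, then report final reward" (assumption).
    # Toy objective: genomes closer to the origin score higher.
    return -sum(g * g for g in agent.morphology)


def tournament_select(population, k, rng):
    # Draw k distinct contestants at random and return the fittest one.
    contestants = rng.sample(population, k)
    return max(contestants, key=lambda a: a.fitness)


def mutate(parent, rng, scale=0.1):
    # Hypothetical mutation: add Gaussian noise to every gene of the parent.
    genome = [g + rng.gauss(0.0, scale) for g in parent.morphology]
    return Agent(morphology=genome)


def evolve(pop_size=16, genome_len=4, generations=5, k=4, seed=0):
    rng = random.Random(seed)
    # Random initial population, evaluated once before evolution starts.
    population = [
        Agent([rng.uniform(-1.0, 1.0) for _ in range(genome_len)])
        for _ in range(pop_size)
    ]
    for agent in population:
        agent.fitness = evaluate(agent)

    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            # Each child descends from the winner of a fresh tournament.
            parent = tournament_select(population, k, rng)
            child = mutate(parent, rng)
            child.fitness = evaluate(child)
            children.append(child)
        # Full generational replacement (an assumption of this sketch).
        population = children
    return population


if __name__ == "__main__":
    final = evolve()
    best = max(final, key=lambda a: a.fitness)
    print(f"best fitness after evolution: {best.fitness:.4f}")
```

The tournament size `k` controls selection pressure in this sketch: larger tournaments favour the current best agents more strongly, which tends to reduce diversity across generations.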