Hardware accelerators and their supporting tools have made fast, massively parallel evaluation affordable in some applications. This has drastically reduced the runtime of evolution-inspired algorithms such as Quality-Diversity (QD) optimization, creating tremendous potential for algorithmic innovation through scale. In this work, we propose MAP-Elites-Multi-ES (MEMES), a novel QD algorithm based on Evolution Strategies (ES) and designed for fast parallel evaluations. MEMES builds on the existing MAP-Elites-ES algorithm and scales it by maintaining multiple independent ES threads that run with massive parallelization. We also introduce a new dynamic reset procedure that sets the lifespan of each independent ES so as to autonomously maximize the improvement of the QD population. We show experimentally that MEMES outperforms existing gradient-based and objective-agnostic QD algorithms when compared in terms of generations. We perform this comparison on both black-box optimization and QD-Reinforcement Learning tasks, demonstrating the benefit of our approach across different problems and domains. Finally, we find that our approach intrinsically enables local optimization of fitness around a niche, a phenomenon not observed in other QD algorithms.
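To make the idea of several independent ES threads sharing one MAP-Elites archive more concrete, the following is a minimal sketch and not the authors' implementation. The toy objective, the descriptor, the names `evaluate`, `try_insert`, `es_step` and `run_multi_es`, all hyperparameters, and in particular the reset rule (restart an ES from a random elite after `patience` generations without improving the archive) are assumptions made for illustration. The emitters are stepped in a plain loop here, whereas the setting described above would batch their evaluations on an accelerator.

```python
import random


def evaluate(x):
    """Toy task (assumption): negative sphere fitness, first two coords as descriptor."""
    fitness = -sum(v * v for v in x)
    descriptor = tuple(min(max(v, 0.0), 0.999) for v in x[:2])
    return fitness, descriptor


def try_insert(archive, x, bins):
    """MAP-Elites rule: keep the best solution seen so far in each grid cell."""
    fitness, desc = evaluate(x)
    cell = tuple(int(d * bins) for d in desc)
    if cell not in archive or fitness > archive[cell][0]:
        archive[cell] = (fitness, list(x))
        return True
    return False


def es_step(mean, sigma, lr, pop, rng):
    """One vanilla ES update: estimate a fitness gradient from Gaussian samples."""
    noises = [[rng.gauss(0.0, 1.0) for _ in mean] for _ in range(pop)]
    scores = [evaluate([m + sigma * e for m, e in zip(mean, eps)])[0]
              for eps in noises]
    baseline = sum(scores) / pop
    grad = [sum((s - baseline) * eps[i] for s, eps in zip(scores, noises))
            / (pop * sigma) for i in range(len(mean))]
    return [m + lr * g for m, g in zip(mean, grad)]


def run_multi_es(n_emitters=4, dim=3, bins=5, generations=30, patience=5, seed=0):
    rng = random.Random(seed)
    archive = {}
    means = [[rng.uniform(0.0, 1.0) for _ in range(dim)] for _ in range(n_emitters)]
    stale = [0] * n_emitters  # generations since each ES last improved the archive
    for mean in means:
        try_insert(archive, mean, bins)
    for _ in range(generations):
        for k in range(n_emitters):  # independent ES threads (sequential here)
            means[k] = es_step(means[k], sigma=0.1, lr=0.05, pop=16, rng=rng)
            if try_insert(archive, means[k], bins):
                stale[k] = 0
            else:
                stale[k] += 1
            # Hypothetical reset rule: restart a stale ES from a random elite.
            if stale[k] >= patience:
                _, elite = archive[rng.choice(sorted(archive))]
                means[k] = list(elite)
                stale[k] = 0
    return archive, means
```

Calling `run_multi_es()` returns the archive (a dict mapping grid cells to `(fitness, solution)` pairs) and the final ES means. A fixed `patience` is the simplest possible reset criterion; it stands in for the dynamic procedure mentioned above and does not reproduce it.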