In many real-world systems, such as adaptive robotics, a single optimised solution may be insufficient. Instead, a diverse set of high-performing solutions is often required to adapt to varying contexts and requirements. This is the realm of Quality-Diversity (QD), which aims to discover a collection of high-performing solutions, each with its own distinct characteristics. QD methods have recently seen success in many domains, including robotics, where they have been used to discover damage-adaptive locomotion controllers. However, most existing work has focused on single-agent settings, even though many tasks of interest are multi-agent. To this end, we introduce Mix-ME, a novel multi-agent variant of the popular MAP-Elites algorithm that forms new solutions with a crossover-like operator that mixes agents from different teams. We evaluate the proposed methods on a variety of partially observable continuous control tasks. Our evaluation shows that the multi-agent variants obtained by Mix-ME not only compete with single-agent baselines but often outperform them in multi-agent settings under partial observability.
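As a rough illustration of the team-mixing idea, the sketch below builds a child team by choosing, for each agent slot, the policy from one of two parent teams. The abstract does not specify how agents are selected, so the uniform per-slot choice, the function name `mix_teams`, and the representation of a team as a list of per-agent policies are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import List, Sequence, TypeVar

# A "policy" is whatever represents one agent (e.g. a parameter vector).
# Hypothetical representation: a team is a sequence with one policy per agent slot.
Policy = TypeVar("Policy")


def mix_teams(
    parent_a: Sequence[Policy],
    parent_b: Sequence[Policy],
    rng: random.Random,
) -> List[Policy]:
    """Return a child team whose i-th agent comes from parent_a or parent_b.

    Assumption: each slot is filled by a uniform random choice between the
    two parents; the source only says agents from different teams are mixed.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parent teams must have the same number of agents")
    return [rng.choice((a, b)) for a, b in zip(parent_a, parent_b)]


if __name__ == "__main__":
    rng = random.Random(0)
    team_a = ["a0", "a1", "a2"]  # placeholder policies for three agents
    team_b = ["b0", "b1", "b2"]
    print(mix_teams(team_a, team_b, rng))
```

In a MAP-Elites-style loop, the two parents would typically be sampled from the archive of elites, and the child would then be evaluated and inserted into the archive if it improves its cell; those steps are omitted here.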