Zero-shot coordination, that is, coordinating effectively with a wide range of unseen partners, remains a significant challenge in cooperative artificial intelligence (AI). Previous algorithms have addressed this challenge by optimizing fixed objectives within a population to improve strategy or behaviour diversity. However, these approaches can lead to a loss of learning and an inability to cooperate with certain strategies within the population, a failure mode known as cooperative incompatibility. To address this issue, we propose the Cooperative Open-ended LEarning (COLE) framework, which constructs open-ended objectives in two-player cooperative games from the perspective of graph theory in order to assess and identify the cooperative ability of each strategy. We further specify the framework and propose a practical algorithm that leverages knowledge from game theory and graph theory. An analysis of the algorithm's learning process shows that it can efficiently overcome cooperative incompatibility. Experimental results in the Overcooked game environment demonstrate that our method outperforms current state-of-the-art methods when coordinating with partners of different levels. Our demo is available at https://sites.google.com/view/cole-2023.
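As a toy illustration of the graph view mentioned in the abstract, the sketch below treats each strategy in a small population as a node and each pairwise cooperative payoff as a weighted edge, then reads off which strategy coordinates worst with the others. This is not the COLE algorithm itself: the payoff values, the scoring rules (mean cross-play payoff and a simple best-partner in-degree), and all function names are assumptions chosen only to make the notion of cooperative incompatibility concrete.

```python
from typing import List

# Hypothetical payoffs: PAYOFFS[i][j] is the shared return when strategy i
# is paired with strategy j (the diagonal is self-play).
PAYOFFS: List[List[float]] = [
    [5.0, 8.0, 1.0],
    [8.0, 6.0, 0.0],
    [1.0, 0.0, 4.0],
]


def cross_play_scores(payoffs: List[List[float]]) -> List[float]:
    """Average edge weight from each node to every other node (no self-play)."""
    n = len(payoffs)
    return [
        sum(payoffs[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def preference_in_degree(payoffs: List[List[float]]) -> List[int]:
    """Count how many other strategies have each node as their best partner."""
    n = len(payoffs)
    counts = [0] * n
    for i in range(n):
        best = max(range(n), key=lambda j: payoffs[i][j])
        if best != i:  # a strategy preferring itself adds no incoming edge
            counts[best] += 1
    return counts


def least_compatible(payoffs: List[List[float]]) -> int:
    """Index of the strategy with the lowest mean cross-play payoff."""
    scores = cross_play_scores(payoffs)
    return min(range(len(scores)), key=lambda i: scores[i])


SCORES = cross_play_scores(PAYOFFS)
IN_DEGREE = preference_in_degree(PAYOFFS)
WEAKEST = least_compatible(PAYOFFS)
```

In this made-up population, strategy 2 does reasonably well in self-play but poorly with both other strategies, and no other strategy prefers it as a partner. That is the kind of pattern the abstract calls cooperative incompatibility.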