Zero-shot coordination, the ability to coordinate effectively with a wide range of unseen partners, remains a significant challenge in cooperative artificial intelligence (AI). Previous algorithms have attempted to address this challenge by optimizing fixed objectives within a population to improve strategy or behavior diversity. However, these approaches can result in a loss of learning and a failure to cooperate with certain strategies within the population, a problem known as cooperative incompatibility. To address this issue, we propose the Cooperative Open-ended LEarning (COLE) framework, which constructs open-ended objectives in two-player cooperative games from the perspective of graph theory in order to assess and identify the cooperative ability of each strategy. We further specify the framework and propose a practical algorithm that leverages knowledge from game theory and graph theory. An analysis of the algorithm's learning process shows that it can efficiently overcome cooperative incompatibility. Experimental results in the Overcooked game environment demonstrate that our method outperforms current state-of-the-art methods when coordinating with partners of different levels. Our code and demo are available at https://sites.google.com/view/cole-2023.