Zero-shot coordination, that is, coordinating effectively with a wide range of unseen partners, remains a significant challenge in cooperative artificial intelligence (AI). Previous algorithms have attempted to address this challenge by optimizing fixed objectives within a population to improve strategy or behaviour diversity. However, these approaches can result in a loss of learning and an inability to cooperate with certain strategies within the population, a failure known as cooperative incompatibility. To address this issue, we propose the Cooperative Open-ended LEarning (COLE) framework, which constructs open-ended objectives in two-player cooperative games from the perspective of graph theory to assess and identify the cooperative ability of each strategy. We further specify the framework and propose a practical algorithm that leverages knowledge from game theory and graph theory. An analysis of the algorithm's learning process shows that it can efficiently overcome cooperative incompatibility. Experimental results in the Overcooked game environment demonstrate that our method outperforms current state-of-the-art methods when coordinating with partners of different levels. Our demo is available at https://sites.google.com/view/cole-2023.