Coordinating with previously unseen human partners remains a substantial challenge in Zero-Shot Human-AI Coordination, which aims to develop AI agents that can work efficiently alongside unfamiliar human teammates. Traditional algorithms pursue this goal by optimizing fixed objectives within a population of strategies, fostering diversity in strategies and behaviors. However, these techniques may lead to learning loss and a failure to cooperate with specific strategies within the population, a phenomenon termed cooperative incompatibility. To mitigate this issue, we introduce the Cooperative Open-ended LEarning (COLE) framework, which formulates open-ended objectives in two-player cooperative games from a graph-theoretic perspective in order to evaluate and pinpoint the cooperative capacity of each strategy. We put forth a practical algorithm that incorporates insights from game theory and graph theory, e.g., the Shapley value and centrality. We also show, through theoretical and empirical analysis, that COLE can effectively overcome cooperative incompatibility. We further built an online Overcooked human-AI experiment platform, the COLE platform, which enables easy customization of questionnaires, model weights, and other aspects. Using the COLE platform, we recruited 130 participants for human experiments. Our findings reveal a preference for our approach over state-of-the-art methods across a variety of subjective metrics. Moreover, objective experimental results in the Overcooked game environment indicate that our method surpasses existing ones when coordinating with previously unencountered AI agents and the human proxy model. Our code and demo are publicly available at https://sites.google.com/view/cole-2023.
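To make the idea of scoring each strategy's cooperative capacity with the Shapley value more concrete, the following is a minimal sketch and not the COLE algorithm itself. It assumes a small, made-up payoff matrix in which `payoff[i][j]` is the reward when strategy `i` is paired with strategy `j`, and it assumes a hypothetical characteristic function (the mean pairwise payoff inside a coalition of strategies). The function names and the numbers are illustrative only.

```python
from itertools import combinations
from math import factorial


def coalition_value(payoff, coalition):
    """Hypothetical characteristic function (an assumption, not from the paper):
    the mean payoff over all ordered pairs of strategies in the coalition."""
    if not coalition:
        return 0.0
    total = sum(payoff[i][j] for i in coalition for j in coalition)
    return total / (len(coalition) ** 2)


def shapley_values(payoff):
    """Exact Shapley value of every strategy under coalition_value.

    Enumerates all subsets, so it is only practical for small populations.
    """
    n = len(payoff)
    values = [0.0] * n
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                marginal = (coalition_value(payoff, subset + (i,))
                            - coalition_value(payoff, subset))
                values[i] += weight * marginal
    return values


# Illustrative payoff matrix for three strategies (made-up numbers).
example_payoff = [
    [8.0, 6.0, 1.0],
    [6.0, 7.0, 2.0],
    [1.0, 2.0, 9.0],
]
example_values = shapley_values(example_payoff)
# The strategy with the smallest value contributes least to the population
# under this assumed characteristic function.
weakest_strategy = min(range(len(example_values)), key=example_values.__getitem__)
```

Under these assumptions the values sum to the value of the full population (the efficiency property of the Shapley value), and a low value flags a strategy as a candidate for further training. Exact enumeration grows exponentially with population size, so a larger population would need a sampling-based approximation.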