Several standardized environments have been designed for partially observable multi-agent cooperation, but most of them are synchronous, whereas real-world agents often have their own action spaces, which leads to asynchrony. Furthermore, a fixed number of agents limits the scalability of the action space, whereas in reality the number of agents can change, resulting in a flexible action space. In addition, current environments are balanced, which is not always the case in the real world, where an ability gap between parties can lead to asymmetry. Finally, current environments tend to have low stochasticity and simple state transitions, whereas real-world environments can be highly stochastic and extremely risky. To address this gap, we propose the WarGame Challenge (WGC), inspired by Wargame. WGC is a lightweight, flexible, and easy-to-use environment with a clear framework that users can easily configure. Along with the benchmark, we provide MARL baseline algorithms such as QMIX and a toolkit to help run performance tests of algorithms on WGC. Finally, we present baseline experiment results, which demonstrate the challenges WGC poses. We believe WGC enriches the partially observable multi-agent cooperation domain and introduces challenges that better reflect real-world characteristics. Code is released at http://turingai.ia.ac.cn/data_center/show/10.