Recent advances in reinforcement learning (RL) rely heavily on a variety of well-designed benchmarks, which provide environment platforms and consistent criteria for evaluating existing and novel algorithms. In multi-agent RL (MARL) specifically, a plethora of benchmarks based on cooperative games have spurred the development of algorithms that improve the scalability of cooperative multi-agent systems. For the competitive setting, however, a lightweight, open-source benchmark with challenging game dynamics and visual inputs has not yet been established. In this work, we present FightLadder, a real-time fighting game platform, to empower competitive MARL research. Along with the platform, we provide implementations of state-of-the-art MARL algorithms for competitive games, as well as a set of evaluation metrics that characterize the performance and exploitability of agents. We demonstrate the feasibility of this platform by training a general agent that consistently defeats 12 built-in characters in single-player mode, and we expose the difficulty of training a non-exploitable agent without human knowledge or demonstrations in two-player mode. FightLadder provides meticulously designed environments that address critical challenges in competitive MARL research, with the aim of catalyzing further discovery and advancement in the field. Videos and code are available at https://sites.google.com/view/fightladder/home.