Balancing game levels for competitive two-player play requires extensive manual work and testing, particularly when the levels are asymmetric. In this paper, we propose an architecture for the automated balancing of tile-based levels within the recently introduced PCGRL framework (procedural content generation via reinforcement learning). Our architecture is divided into three parts: (1) a level generator, (2) a balancing agent, and (3) a reward-modeling simulation. The level is played repeatedly in a simulation, and the balancing agent is rewarded for modifying it toward equal win rates for all players. To this end, we introduce a novel family of swap-based representations that increase robustness with respect to playability. We show that this approach is capable of teaching an agent to alter a level for balance better and faster than plain PCGRL. In addition, by analyzing the agent's swapping behavior, we can draw conclusions about which tile types influence the balance the most. We test our approach and present our results using the Neural MMO (NMMO) environment in a competitive two-player setting.
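The interplay of swap actions and a win-rate-based reward can be illustrated with a minimal sketch. It is not the implementation used in this work: the names `swap_tiles`, `balance_score`, `estimate_balance`, and `swap_reward`, the tile letters, and the linear reward shape are all assumptions made for illustration, and `simulate` stands in for any simulation that plays one game on a level and returns the index of the winning player (0 or 1).

```python
from typing import Callable, List, Tuple

Level = List[List[str]]   # tile grid, e.g. "F" food, "G" grass, "W" water (hypothetical)
Pos = Tuple[int, int]     # (row, column)


def swap_tiles(level: Level, a: Pos, b: Pos) -> Level:
    """Return a copy of the level with the tiles at positions a and b exchanged.

    A swap never adds or removes tiles, so the count of each tile type
    is unchanged by the action.
    """
    new_level = [row[:] for row in level]
    (r1, c1), (r2, c2) = a, b
    new_level[r1][c1], new_level[r2][c2] = new_level[r2][c2], new_level[r1][c1]
    return new_level


def balance_score(wins_p0: int, n_games: int) -> float:
    """Assumed score: 1.0 at a 50% win rate, 0.0 if one player wins every game."""
    win_rate = wins_p0 / n_games
    return 1.0 - 2.0 * abs(win_rate - 0.5)


def estimate_balance(level: Level, simulate: Callable[[Level], int],
                     n_games: int = 20) -> float:
    """Play the level n_games times and score how even the outcomes are."""
    wins_p0 = sum(1 for _ in range(n_games) if simulate(level) == 0)
    return balance_score(wins_p0, n_games)


def swap_reward(level: Level, a: Pos, b: Pos,
                simulate: Callable[[Level], int],
                n_games: int = 20) -> Tuple[Level, float]:
    """Apply one swap and reward the change in estimated balance."""
    before = estimate_balance(level, simulate, n_games)
    new_level = swap_tiles(level, a, b)
    after = estimate_balance(new_level, simulate, n_games)
    return new_level, after - before
```

In this sketch a positive reward means the swap moved the estimated win rates closer to equal. Because the estimate comes from a finite number of simulated games, it is noisy whenever the simulation is stochastic, so `n_games` trades off reward accuracy against simulation cost.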