Deep reinforcement learning combined with fictitious play has shown impressive results on many benchmark games, most of which, however, are single-stage. In contrast, real-world decision-making problems may consist of multiple stages, where the observation and action spaces can differ completely from one stage to the next. We study Legends of Code and Magic, a two-stage strategy card game, and propose an end-to-end policy to address the difficulties that arise in multi-stage games. We also propose an optimistic smooth fictitious play algorithm to find the Nash equilibrium of the two-player game. Our approach won two championships in the COG2022 competition. Extensive studies verify the advantages of our approach.