This paper presents an empirical study of non-transitivity in perfect-information games, focusing on Xiangqi, a traditional Chinese board game comparable in game-tree complexity to chess and shogi. By analyzing over 10,000 records of human Xiangqi play, we show that the game's strategic structure contains both transitive and non-transitive elements. To address non-transitivity, we introduce the JiangJun algorithm, which combines Monte-Carlo Tree Search (MCTS) with Policy Space Response Oracles (PSRO) to approximate a Nash equilibrium. Evaluated against human players through a WeChat mini program, JiangJun reaches Master level with a 99.41\% win rate. Its effectiveness in overcoming non-transitivity is supported by several metrics, including relative population performance, and by visualization results. Our project site is available at \url{https://sites.google.com/view/jiangjun-site/}.
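To make the two central notions concrete, the following minimal Python sketch illustrates what a non-transitive cycle is and how an equilibrium mixture can be approximated on a tiny matrix game. It is not the JiangJun algorithm and does not implement PSRO or MCTS. It uses rock-paper-scissors as a stand-in for a non-transitive strategy population and plain fictitious play as a simple substitute for equilibrium approximation. The names `has_cycle`, `fictitious_play`, `RPS`, and `TRANSITIVE` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Antisymmetric payoff matrices: entry [i, j] is the payoff of strategy i against j.
# RPS is the textbook non-transitive game (assumed here purely as an illustration).
RPS = np.array([[0, -1, 1],
                [1, 0, -1],
                [-1, 1, 0]], dtype=float)

# A purely transitive game: a higher index always beats a lower index.
TRANSITIVE = np.array([[0, -1, -1],
                       [1, 0, -1],
                       [1, 1, 0]], dtype=float)


def has_cycle(payoff):
    """Return True if some triple satisfies: i beats j, j beats k, k beats i."""
    n = payoff.shape[0]
    beats = payoff > 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if beats[i, j] and beats[j, k] and beats[k, i]:
                    return True
    return False


def fictitious_play(payoff, iterations=20000):
    """Approximate a symmetric equilibrium of a symmetric zero-sum matrix game.

    At each step, the best response to the empirical mixture of all past
    plays is added to the history; the returned vector is that mixture.
    """
    n = payoff.shape[0]
    counts = np.zeros(n)
    counts[0] = 1.0  # arbitrary starting strategy
    for _ in range(iterations):
        empirical = counts / counts.sum()
        best_response = int(np.argmax(payoff @ empirical))
        counts[best_response] += 1.0
    return counts / counts.sum()


if __name__ == "__main__":
    print("RPS has a cycle:", has_cycle(RPS))
    print("Transitive game has a cycle:", has_cycle(TRANSITIVE))
    print("Approximate equilibrium for RPS:", fictitious_play(RPS))
```

In the cyclic game no single strategy dominates, so the approximate equilibrium spreads its weight across strategies, whereas in the transitive game the mixture concentrates on the strongest one. This is the intuition behind targeting a Nash equilibrium over a population rather than a single best policy.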