This paper proposes a new game search algorithm, PN-MCTS, that combines Monte-Carlo Tree Search (MCTS) and Proof-Number Search (PNS). Both algorithms have been successfully applied to decision making in a range of domains. We define three areas where the additional knowledge provided by the proof and disproof numbers gathered in MCTS trees might be used: final move selection, solving subtrees, and the UCT formula. We test all possible combinations of these three enhancements under different time settings, playing against vanilla UCT MCTS on several games: Lines of Action ($7 \times 7$ and $8 \times 8$), MiniShogi, Knightthrough, Awari, and Gomoku. Furthermore, we extend the new algorithm to properly address games with draws, such as Awari, by adding an extra layer of PNS on top of the MCTS tree. The experiments show that PN-MCTS confidently outperforms MCTS in 5 out of 6 game domains (all except Gomoku), achieving win rates of up to 96.2% for Lines of Action.
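For readers unfamiliar with proof and disproof numbers, the following is a minimal sketch of the standard PNS backup rule on an AND/OR tree. It is not the PN-MCTS implementation: the abstract does not specify how these numbers enter final move selection, subtree solving, or the UCT formula, so none of that is shown. The names `Node`, `set_terminal`, and `backup` are hypothetical and chosen only for this illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List

INF = math.inf


@dataclass
class Node:
    # Hypothetical node type for illustration only.
    is_or: bool                       # True: the proving player is to move (OR node)
    children: List["Node"] = field(default_factory=list)
    pn: float = 1.0                   # proof number; unknown leaves start at 1
    dn: float = 1.0                   # disproof number; unknown leaves start at 1


def set_terminal(node: Node, proven: bool) -> None:
    """Mark a leaf as proven (pn=0, dn=inf) or disproven (pn=inf, dn=0)."""
    node.pn, node.dn = (0.0, INF) if proven else (INF, 0.0)


def backup(node: Node) -> None:
    """Recompute pn/dn bottom-up using the standard PNS rules."""
    if not node.children:
        return                        # leaves keep their current values
    for child in node.children:
        backup(child)
    pns = [c.pn for c in node.children]
    dns = [c.dn for c in node.children]
    if node.is_or:
        # OR node: one proven child suffices; all children must be disproven.
        node.pn, node.dn = min(pns), sum(dns)
    else:
        # AND node: all children must be proven; one disproven child suffices.
        node.pn, node.dn = sum(pns), min(dns)
```

A node with `pn == 0` is proven and a node with `dn == 0` is disproven; smaller finite values indicate that fewer leaves remain to be resolved, which is the kind of additional knowledge the abstract refers to.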