This paper proposes a new game-search algorithm, PN-MCTS, which combines Monte-Carlo Tree Search (MCTS) and Proof-Number Search (PNS). Both algorithms have been applied successfully to decision making in a range of domains. We identify three places where the additional knowledge provided by proof and disproof numbers gathered in the MCTS tree can be used: final move selection, solving subtrees, and the UCB1 selection mechanism. We test all possible combinations of these three enhancements under different time settings, playing against vanilla UCT in several games: Lines of Action ($7 \times 7$ and $8 \times 8$ board sizes), MiniShogi, Knightthrough, and Awari. Furthermore, we extend the algorithm to handle games with draws, such as Awari, by adding an extra layer of PNS on top of the MCTS tree. The experiments show that PN-MCTS confidently outperforms MCTS in all tested game domains, achieving win rates of up to 96.2\% in Lines of Action.
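To make the idea of proof and disproof numbers inside an MCTS tree more concrete, the following is a minimal sketch and not the paper's implementation. It uses the standard PNS backup rule (at an OR node the proof number is the minimum over the children and the disproof number is the sum; at an AND node the roles are swapped) together with plain UCB1. The `Node` class, the `select_child` function, the rank-based bonus, and the `pn_weight` parameter are all hypothetical names and choices introduced only for illustration; the abstract does not specify how the proof numbers enter the UCB1 formula. The sketch also assumes that `wins` is stored from the perspective of the player choosing at the parent.

```python
import math
from dataclasses import dataclass, field
from typing import List

INF = math.inf


@dataclass
class Node:
    """Hypothetical tree node holding both MCTS and PNS statistics."""
    is_or: bool                 # True if the root player is to move here
    visits: int = 0
    wins: float = 0.0           # assumed: from the parent's mover's perspective
    pn: float = 1.0             # proof number (1 for an unexpanded leaf)
    dpn: float = 1.0            # disproof number (1 for an unexpanded leaf)
    children: List["Node"] = field(default_factory=list)


def update_proof_numbers(node: Node) -> None:
    """Standard PNS backup for a single internal node."""
    if not node.children:
        return
    pns = [c.pn for c in node.children]
    dpns = [c.dpn for c in node.children]
    if node.is_or:
        node.pn, node.dpn = min(pns), sum(dpns)
    else:
        node.pn, node.dpn = sum(pns), min(dpns)


def is_proven_win(node: Node) -> bool:
    """A proof number of 0 means the subtree is a proven win."""
    return node.pn == 0


def is_proven_loss(node: Node) -> bool:
    """A disproof number of 0 means the subtree is a proven loss."""
    return node.dpn == 0


def ucb1(child: Node, parent_visits: int, c: float = math.sqrt(2)) -> float:
    """Plain UCB1; unvisited children get infinite priority."""
    if child.visits == 0:
        return INF
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node, pn_weight: float = 0.5) -> Node:
    """UCB1 plus a rank-based bonus (an assumption, not the paper's formula).

    Children are ranked by proof number at OR nodes and by disproof
    number at AND nodes; the best-ranked child gets the full bonus.
    """
    key = (lambda ch: ch.pn) if parent.is_or else (lambda ch: ch.dpn)
    ranked = sorted(parent.children, key=key)
    n = len(ranked)
    best, best_score = None, -INF
    for rank, child in enumerate(ranked):
        bonus = pn_weight * (1 - rank / (n - 1)) if n > 1 else 0.0
        score = ucb1(child, parent.visits) + bonus
        if score > best_score:
            best, best_score = child, score
    return best
```

In this sketch, a subtree counts as solved as soon as `is_proven_win` or `is_proven_loss` returns true for its root, which is one way the proof numbers could support the "solving subtrees" use mentioned in the abstract. Ties in the ranking are broken by child order, which a real implementation would likely treat more carefully.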