Counterfactual Regret Minimization (CFR) and its variants, which build on Regret Matching (RM), are widely considered the best methods for solving incomplete-information extensive-form games. Fictitious Play (FP) is another equilibrium-computation algorithm, used in normal-form games. Experience has shown that FP converges more slowly than RM and is difficult to apply to extensive-form games. Recent research, however, has addressed both issues. First, Abernethy proposed a new FP variant, sync FP, which converges faster than RM+. Second, Qi introduced FP into extensive-form games and proposed Pure CFR (PCFR). This paper combines these two improvements into a new algorithm, sync PCFR. In our experiments, sync PCFR converges approximately an order of magnitude faster than CFR+ (the state-of-the-art algorithm for equilibrium computation in incomplete-information extensive-form games) while requiring less memory per iteration.