Recent advances in algorithms for sequential decision-making under imperfect information have shown remarkable success in large games such as limit and no-limit poker. These algorithms traditionally formalize games as extensive-form games, a formalism that, as we show, is theoretically sound but memory-inefficient and computationally intensive in practice. A popular workaround is a specialized representation based on player-specific information-state trees. However, we show that this alternative significantly narrows the set of games that can be represented efficiently. In this study, we identify the large games on which modern algorithms have been benchmarked as being naturally represented by sequential Bayesian games. We elucidate the critical differences between the extensive-form game and sequential Bayesian game representations, both theoretically and empirically. We further argue that the impressive experimental results often cited in the literature may be skewed, because they frequently stem from testing these algorithms only on this restricted class of games. By clarifying these distinctions, we aim to guide future research toward more universally applicable and efficient algorithms for sequential decision-making under imperfect information.