Recent advances in algorithms for sequential decision-making under imperfect information have shown remarkable success in large games such as limit and no-limit poker. These algorithms traditionally model games in the extensive-form game formalism, which, as we show, is theoretically sound but memory-inefficient and computationally intensive in practice. To mitigate these challenges, a popular workaround is a specialized representation based on player-specific information-state trees. However, as we show, this alternative significantly narrows the set of games that can be represented efficiently. In this study, we identify the large games on which modern algorithms have been benchmarked as being naturally represented by sequential Bayesian games. We elucidate the critical differences between the extensive-form game and sequential Bayesian game representations, both theoretically and empirically. We further argue that the impressive experimental results often cited in the literature may be skewed, as they frequently stem from testing these algorithms only on this restricted class of games. By understanding these nuances, we aim to guide future research toward more universally applicable and efficient algorithms for sequential decision-making under imperfect information.