Historically applied exclusively to perfect information games, depth-limited search with value functions has been key to recent advances in AI for imperfect information games. The most prominent approaches with strong theoretical guarantees require subgame decomposition, a process in which a subgame is computed from public information and player beliefs. However, subgame decomposition can itself require non-trivial computation, and its tractability depends on the existence of efficient algorithms for either fully enumerating or generating the histories that form the root of the subgame. Despite this, prior work has established no formal analysis of the tractability of such computations, and application domains have often consisted of games, such as poker, for which enumeration is trivial on modern hardware. Applying these ideas to more complex domains requires understanding their cost. In this work, we introduce and analyze the computational aspects and tractability of filtering histories for subgame decomposition. We show that constructing a single history from the root of the subgame is generally intractable, and then provide a necessary and sufficient condition for efficient enumeration. We also introduce a novel Markov chain Monte Carlo (MCMC) generation algorithm for trick-taking card games, a domain where enumeration is often prohibitively expensive. Our experiments demonstrate its improved scalability in the trick-taking card game Oh Hell. These contributions clarify when and how depth-limited search via subgame decomposition can be an effective tool for sequential decision-making in imperfect information settings.
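To make the idea of generating histories consistent with public information more concrete, the following is a minimal illustrative sketch and not the algorithm introduced in this work, whose details are not given here. It assumes a simplified trick-taking setting in which a history is reduced to a deal of the unseen cards, the only public constraints are known void suits (a player who failed to follow suit holds no card of that suit), and every hand is non-empty. The names `is_consistent`, `mcmc_deals`, `hands`, and `voids` are hypothetical. The chain proposes swapping one card between two players and accepts the swap only if the constraints still hold, which is a Metropolis step with a symmetric proposal and a uniform target over consistent deals. Whether such a chain can reach every consistent deal depends on the constraint structure and is not guaranteed by this sketch.

```python
import random


def is_consistent(hands, voids):
    """Return True if no player holds a card of a suit they are known to be void in.

    Cards are strings whose first character is the suit, e.g. "H7" (assumed encoding).
    """
    return all(
        card[0] not in voids.get(player, set())
        for player, hand in hands.items()
        for card in hand
    )


def mcmc_deals(hands, voids, steps, rng):
    """Yield one deal per step of a swap-based Metropolis chain.

    Assumes `hands` is a consistent starting deal with non-empty hands for
    at least two players. Hand sizes never change, so the proposal is symmetric.
    """
    hands = {player: list(hand) for player, hand in hands.items()}
    if not is_consistent(hands, voids):
        raise ValueError("the starting deal must satisfy the void constraints")
    players = sorted(hands)
    for _ in range(steps):
        # Propose: pick two distinct players and one card from each hand.
        p, q = rng.sample(players, 2)
        i = rng.randrange(len(hands[p]))
        j = rng.randrange(len(hands[q]))
        a, b = hands[p][i], hands[q][j]
        # Accept only if the swapped deal still respects every void constraint;
        # otherwise the chain stays at the current deal.
        if a[0] not in voids.get(q, set()) and b[0] not in voids.get(p, set()):
            hands[p][i], hands[q][j] = b, a
        yield {player: frozenset(hand) for player, hand in hands.items()}


# Hypothetical example: East is known to hold no hearts.
start = {"N": ["H2", "S3"], "E": ["C4", "D5"], "W": ["H6", "C7"]}
known_voids = {"E": {"H"}}
samples = list(mcmc_deals(start, known_voids, steps=200, rng=random.Random(0)))
```

Each element of `samples` is a deal that respects the void constraints. In practice one would discard early samples and thin the chain, and a realistic filter would also need to account for constraints this sketch ignores, such as bids or the order of play.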