Extensive-form games (EFGs) are a standard model for sequential decision-making in games. A fundamental and typically implicit assumption in EFGs is that players always have access to all of their actions at every decision point. However, in many realistic settings, certain actions might be unavailable during game-play due to exogenous stochasticity, which limits the expressivity of the standard EFG model. Given a "base" EFG, we formalize a model that allows actions to be stochastically restricted, yielding a corresponding Extensive-Form Game with Stochastic Action Sets (EFGSAS). For EFGSAS, we derive an expansion procedure that produces an equivalent EFG, which shows that standard strategy formalisms could require exponentially large representations. However, under an appropriate independence assumption, we show that compact strategy representations exist whose size is polynomial in the size of the base EFG. Computationally, we introduce an algorithm called SI-CFR that minimizes sleeping internal regret and converges to Nash equilibria with high probability in two-player zero-sum EFGSAS. Finally, we use a stochastic approximation procedure to recover compact representations of Nash equilibria from the iterates of SI-CFR alone.
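To give a rough sense of why expanding stochastic action sets can blow up the strategy representation, the sketch below enumerates the available-action subsets at a single decision point. It is an illustration only, not the paper's expansion procedure: the function names are hypothetical, the empty action set is simply ignored, and the per-action independent availability probabilities are an assumed toy form of independence that may differ from the assumption the paper actually uses.

```python
from itertools import combinations


def nonempty_subsets(actions):
    """Yield every non-empty subset of `actions` as a tuple."""
    for size in range(1, len(actions) + 1):
        yield from combinations(actions, size)


def subset_probability(subset, avail_prob):
    """Probability that exactly `subset` is the available action set.

    Assumption (for illustration only): each action a is available
    independently with probability avail_prob[a].
    """
    prob = 1.0
    for action, q in avail_prob.items():
        prob *= q if action in subset else (1.0 - q)
    return prob


def expanded_table_size(num_actions):
    """Number of non-empty action subsets at one decision point.

    A strategy that stores one distribution per available subset needs
    this many entries at the decision point, which is exponential in
    the number of base actions.
    """
    return 2 ** num_actions - 1


if __name__ == "__main__":
    # Hypothetical availability probabilities for three base actions.
    avail_prob = {"left": 0.9, "right": 0.5, "stay": 0.8}
    for subset in nonempty_subsets(list(avail_prob)):
        print(subset, subset_probability(subset, avail_prob))
    for k in (3, 10, 20):
        print(k, "actions ->", expanded_table_size(k), "subsets")
```

Under this toy assumption the environment is described by one number per action, while a per-subset strategy table grows as 2^k - 1; the gap between those two sizes is the kind of blow-up a compact representation would aim to avoid.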