Extensive-form games (EFGs) provide a powerful framework for modeling sequential decision making, capturing strategic interaction under imperfect information, chance events, and temporal structure. Most positive algorithmic and theoretical results for EFGs assume perfect recall, where players remember all past information and actions. We study the increasingly relevant setting of imperfect-recall EFGs (IREFGs), where players may forget parts of their history or previously acquired information, and where equilibrium computation is provably hard (NP-hard in general). We propose sum-of-squares (SOS) hierarchies, formulated over behavioral strategies, for computing ex-ante optimal strategies in single-player IREFGs and Nash equilibria in multi-player IREFGs. Our theoretical results show that (i) these hierarchies converge asymptotically, (ii) under genericity assumptions, convergence is finite, and (iii) in single-player non-absentminded IREFGs, convergence occurs at a finite level determined by the number of information sets. Finally, we introduce the new classes of (SOS)-concave and (SOS)-monotone IREFGs, and show that in the single-player setting the SOS hierarchy converges at the first level, enabling equilibrium computation with a single semidefinite program (SDP).
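To make the single-SDP claim concrete, the following is a minimal sketch, not the paper's implementation, of a first-level moment/SOS relaxation for the classic single-player absentminded-driver IREFG of Piccione and Rubinstein. The driver exits with the same probability p at two indistinguishable intersections, with payoffs 0 (exit immediately), 4 (continue, then exit), and 1 (continue twice), so the ex-ante expected utility U(p) = 4p(1-p) + (1-p)^2 = 1 + 2p - 3p^2 is a concave polynomial in the single behavioral-strategy parameter, and the first level of the hierarchy is already exact. The use of cvxpy with the SCS solver is an assumption of this sketch, not the authors' toolchain.

```python
# A minimal, hypothetical sketch (assuming cvxpy + SCS; not the paper's code):
# first-level moment/SOS relaxation of the absentminded-driver problem.
import cvxpy as cp

# Order-1 moment matrix M = [[1, y1], [y1, y2]], where y1 and y2 play the role
# of the first and second moments of the exit probability p under a putative
# probability measure on [0, 1].
M = cp.Variable((2, 2), symmetric=True)
y1, y2 = M[0, 1], M[1, 1]

constraints = [
    M >> 0,        # moment matrix must be positive semidefinite
    M[0, 0] == 1,  # zeroth moment (normalization)
    y1 >= 0,       # localizing condition for p >= 0 (scalar at this level)
    y1 <= 1,       # localizing condition for 1 - p >= 0
]

# Ex-ante expected utility U(p) = 1 + 2p - 3p^2 as a linear functional of moments.
objective = cp.Maximize(1 + 2 * y1 - 3 * y2)

prob = cp.Problem(objective, constraints)
prob.solve(solver=cp.SCS)
print(f"level-1 SOS bound:    {prob.value:.4f}")  # ~1.3333 = 4/3
print(f"recovered strategy p* {y1.value:.4f}")    # ~0.3333 = 1/3
```

Solving this single SDP recovers the known ex-ante optimum p* = 1/3 with value 4/3; higher levels of the hierarchy would add moment variables and larger moment and localizing matrices, which is where the asymptotic and finite convergence results above come into play.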