We study stochastic zero-sum games on graphs, which are prevalent tools to model decision-making in the presence of an antagonistic opponent in a random environment. In this setting, an important question is that of strategy complexity: what kinds of strategies are sufficient or required to play optimally (e.g., randomization or memory requirements)? Our contributions further the understanding of arena-independent finite-memory (AIFM) determinacy, i.e., the study of objectives for which memory is needed, but in a way that only depends on limited parameters of the game graphs. First, we show that objectives for which pure AIFM strategies suffice to play optimally also admit pure AIFM subgame-perfect strategies. Second, we show that the study of objectives for which pure AIFM strategies suffice in two-player stochastic games reduces to the easier study of one-player stochastic games (i.e., Markov decision processes). Third, we characterize the sufficiency of AIFM strategies through two intuitive properties of objectives. This work extends a line of research started on deterministic games to stochastic ones.