We study stochastic zero-sum games on graphs, which are widely used to model decision-making in the presence of an antagonistic opponent in a random environment. In this setting, an important question is that of strategy complexity: what kinds of strategies are sufficient or required to play optimally (e.g., in terms of randomization or memory requirements)? Our contributions further the understanding of arena-independent finite-memory (AIFM) determinacy, i.e., the study of objectives for which memory is needed, but in a way that depends only on limited parameters of the game graphs. First, we show that objectives for which pure AIFM strategies suffice to play optimally also admit pure AIFM subgame perfect strategies. Second, we show that the study of objectives for which pure AIFM strategies suffice in two-player stochastic games can be reduced to the easier study of one-player stochastic games (i.e., Markov decision processes). Third, we characterize the sufficiency of AIFM strategies through two intuitive properties of objectives. This work extends a line of research started on deterministic games to stochastic ones.