We introduce anytime constraints to the multi-agent setting, with anytime-constrained equilibrium (ACE) as the corresponding solution concept. We then present a comprehensive theory of anytime-constrained Markov games, which includes (1) a computational characterization of feasible policies, (2) a fixed-parameter tractable algorithm for computing ACE, and (3) a polynomial-time algorithm for approximately computing feasible ACE. Since computing a feasible policy is NP-hard even for two-player zero-sum games, our approximation guarantees are the best possible under worst-case analysis. We also develop the first theory of efficient computation for action-constrained Markov games, which may be of independent interest.