A celebrated connection at the interface of online learning and game theory establishes that players minimizing swap regret converge to correlated equilibria (CE) -- a seminal game-theoretic solution concept. Despite the long history of this problem and the renewed interest it has received in recent years, a basic question remains open: how many iterations are needed to approximate an equilibrium under the usual normal-form representation? In this paper, we provide evidence that existing learning algorithms, such as multiplicative weights update, are close to optimal. In particular, we prove lower bounds for the problem of computing a CE that can be expressed as a uniform mixture of $T$ product distributions -- namely, a uniform $T$-sparse CE; such lower bounds immediately circumscribe (computationally bounded) regret minimization algorithms in games. Our results are obtained in the algorithmic framework put forward by Kothari and Mehta (STOC 2018) in the context of computing Nash equilibria, which combines the sum-of-squares (SoS) relaxation with access to a verification oracle; the goal in that framework is to lower bound either the degree of the SoS relaxation or the number of queries to the verification oracle. We obtain two such hardness results, ruling out the computation of (i) uniform $\log n$-sparse CE when $\epsilon = \text{poly}(1/\log n)$ and (ii) uniform $n^{1 - o(1)}$-sparse CE when $\epsilon = \text{poly}(1/n)$.
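To make the notion of a uniform $T$-sparse CE concrete, the following is a minimal sketch for the two-player case. It is an illustration only and not taken from the paper: the function names `uniform_mixture` and `ce_gap`, the payoff matrices `A` and `B`, and the restriction to two players are all assumptions made for the example. It builds the uniform mixture of $T$ product distributions and measures the largest gain any player could obtain from a swap deviation; the mixture is an $\epsilon$-CE when this gap is at most $\epsilon$.

```python
def uniform_mixture(products):
    """Joint distribution mu[a][b] of the uniform mixture of T product
    distributions. `products` is a list of (x, y) pairs, where x and y are
    probability vectors for the row and column player (illustrative names)."""
    T = len(products)
    n, m = len(products[0][0]), len(products[0][1])
    mu = [[0.0] * m for _ in range(n)]
    for x, y in products:
        for a in range(n):
            for b in range(m):
                mu[a][b] += x[a] * y[b] / T
    return mu


def ce_gap(mu, A, B):
    """Largest expected gain of either player from its best swap deviation.
    A[a][b] and B[a][b] are the row and column player's payoffs (assumed)."""
    n, m = len(mu), len(mu[0])
    # The best swap function decomposes over recommended actions, so each
    # recommendation is optimized independently and the gains are summed.
    row_gain = 0.0
    for a in range(n):
        base = sum(mu[a][b] * A[a][b] for b in range(m))
        best = max(sum(mu[a][b] * A[a2][b] for b in range(m)) for a2 in range(n))
        row_gain += best - base
    col_gain = 0.0
    for b in range(m):
        base = sum(mu[a][b] * B[a][b] for a in range(n))
        best = max(sum(mu[a][b] * B[a][b2] for a in range(n)) for b2 in range(m))
        col_gain += best - base
    return max(row_gain, col_gain)


# Coordination game: both players get 1 when their actions match, else 0.
COORD = [[1, 0], [0, 1]]
# Uniform mixture of T = 2 pure product distributions: (0, 0) and (1, 1).
mu_coord = uniform_mixture([([1, 0], [1, 0]), ([0, 1], [0, 1])])
gap_coord = ce_gap(mu_coord, COORD, COORD)

# Matching pennies (zero-sum); a single pure profile is far from a CE.
MP_ROW = [[1, -1], [-1, 1]]
MP_COL = [[-1, 1], [1, -1]]
mu_mp = uniform_mixture([([1, 0], [1, 0])])
gap_mp = ce_gap(mu_mp, MP_ROW, MP_COL)
```

In the coordination game the two-component mixture has zero gap, so it is a uniform $2$-sparse exact CE in this sketch's sense, whereas the single pure profile in matching pennies leaves the column player a profitable deviation. The lower bounds in the abstract concern how small $T$ can be made for a given $\epsilon$, not this verification step.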