The congestion game is a powerful model that encompasses a range of engineering systems, such as traffic networks and resource allocation. It describes the behavior of a group of agents who share a common set of $F$ facilities, where each action is a subset of $k$ facilities. In this work, we study the online formulation of congestion games, in which agents participate in the game repeatedly and observe noisy (random) feedback. We propose CongestEXP, a decentralized algorithm that applies the classic exponential weights method. By maintaining weights at the facility level, CongestEXP attains a regret bound that avoids the exponential dependence on the number of possible facility sets, i.e., $\binom{F}{k} \approx F^k$, and scales only linearly with $F$. Specifically, we show that CongestEXP attains a regret upper bound of $O(kF\sqrt{T})$ for every individual player, where $T$ is the time horizon. Moreover, exploiting the exponential growth of weights enables CongestEXP to achieve a fast convergence rate: if a strict Nash equilibrium exists, we show that CongestEXP converges to the strict Nash policy almost exponentially fast, at a rate of $O(F\exp(-t^{1-\alpha}))$, where $t$ is the number of iterations and $\alpha \in (1/2, 1)$.
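To make the facility-level idea concrete, the following is a minimal sketch of exponential weights kept per facility rather than per subset. It is an illustration under stated assumptions, not the CongestEXP algorithm itself: the function names (`action_distribution`, `update_scores`), the learning rate `eta`, and the full-information update (every facility's reward is observed each round) are hypothetical simplifications, and the subsets are enumerated explicitly only because $F$ and $k$ are tiny here.

```python
import itertools
import math


def action_distribution(scores, k, eta):
    """Return all k-subsets of facilities and their sampling probabilities.

    A subset's weight is exp(eta * sum of its facilities' scores), so the
    learner stores only len(scores) numbers even though there are C(F, k)
    subsets. Enumerating the subsets is for illustration only.
    """
    actions = list(itertools.combinations(range(len(scores)), k))
    logits = [eta * sum(scores[f] for f in a) for a in actions]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in logits]
    total = sum(weights)
    return actions, [w / total for w in weights]


def update_scores(scores, rewards):
    """Accumulate one round of per-facility rewards.

    Assumption: rewards for all facilities are observed (full information).
    """
    return [s + r for s, r in zip(scores, rewards)]


# Hypothetical toy instance: F = 4 facilities, actions are subsets of k = 2.
scores = [0.0, 0.0, 0.0, 0.0]
scores = update_scores(scores, [1.0, 0.0, 0.0, 1.0])
actions, probs = action_distribution(scores, k=2, eta=1.0)
best = actions[probs.index(max(probs))]
```

In this toy instance the state is four facility scores, while the induced distribution covers all $\binom{4}{2} = 6$ subsets; the subset made of the two rewarded facilities receives the largest probability.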