Consider $N$ players and $K$ games taking place simultaneously. Each of these games is modeled as a Tug-of-War (ToW) game, in which increasing the action of one player decreases the reward of all other players. Each player participates in only one game at any given time. At each time step, a player decides which game to participate in and which action to take in that game. Their reward depends on the actions of all players in the same game. This system of $K$ games is termed a 'Meta Tug-of-War' (Meta-ToW) game. These games can model scenarios such as power control, distributed task allocation, and activation in sensor networks. We propose the Meta Tug-of-Peace algorithm, a distributed algorithm in which actions are updated using a simple stochastic approximation algorithm, and the decision to switch games is made using infrequent 1-bit communication between the players. We prove that in Meta-ToW games, our algorithm converges to an equilibrium that satisfies a target Quality of Service (QoS) reward vector for the players. We then demonstrate the efficacy of our algorithm through simulations for the scenarios mentioned above.
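To make the setting concrete, the following is a minimal Python sketch of the Meta-ToW reward structure and of a stochastic approximation action update toward a target QoS vector. It is an illustration under stated assumptions, not the Meta Tug-of-Peace algorithm itself: the reward function (a power-control-style ratio), the step-size schedule, the observation noise, and the names `rewards`, `sa_step`, and `run` are all hypothetical. The game assignment is held fixed, so the 1-bit game-switching rule is not reproduced here.

```python
import numpy as np

def rewards(actions, games, noise=1.0):
    """Assumed ToW reward: own action divided by (noise + co-players' actions).

    Only players in the same game interfere with each other, and raising
    one player's action lowers the reward of every other player in that game.
    """
    r = np.empty_like(actions)
    for i in range(len(actions)):
        same_game = games == games[i]
        interference = actions[same_game].sum() - actions[i]
        r[i] = actions[i] / (noise + interference)
    return r

def sa_step(actions, games, targets, step, rng, obs_std=0.01):
    """One stochastic approximation step using a noisy reward observation."""
    observed = rewards(actions, games) + rng.normal(0.0, obs_std, size=actions.shape)
    # Raise the action when below target, lower it when above; keep it non-negative.
    return np.maximum(actions + step * (targets - observed), 0.0)

def run(games, targets, num_steps=20000, seed=0):
    """Iterate the update with a decaying step size (assumed schedule t^-0.6)."""
    rng = np.random.default_rng(seed)
    actions = np.ones(len(targets))
    for t in range(1, num_steps + 1):
        actions = sa_step(actions, games, targets, 1.0 / t**0.6, rng)
    return actions

if __name__ == "__main__":
    games = np.array([0, 0, 1, 1])             # N = 4 players, K = 2 games (fixed)
    targets = np.array([0.3, 0.4, 0.5, 0.2])   # hypothetical feasible QoS targets
    actions = run(games, targets)
    print("actions:", actions)
    print("rewards:", rewards(actions, games))
```

In this sketch the targets were chosen to be jointly achievable within each game. Handling targets that are not achievable under the current assignment is where a game-switching decision would be needed, and that part is deliberately left out.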