We study a multi-agent decision problem in population games, in which agents select from multiple available strategies and continually revise their selections based on the payoffs associated with those strategies. Departing from conventional population game formulations, we consider a scenario in which agents must estimate the payoffs through local measurements and communication with their neighbors. Using task allocation games, a dynamic extension of conventional population games, we examine how errors in the individual agents' payoff estimates affect the strategy revision process. Our main contribution is an analysis of how these estimation errors affect the convergence of the agents' strategy profile to equilibrium. Based on the analytical results, we propose a design for a time-varying strategy revision rate to guarantee convergence. Simulation studies illustrate how the proposed revision-rate update facilitates convergence to equilibrium.