We consider a general class of multi-agent games on networks, the generalized vertex coloring games (G-VCGs), inspired by real-life applications of the venue selection problem in event planning. Under a given mechanism, each agent receives a utility determined by the current coloring assignment. Striving to maximize his own utility, each agent has access only to local information and must therefore self-organize when choosing another color. Our focus is on maximizing, in a decentralized fashion, a utilitarian-style welfare objective defined through the cumulative utilities across the network. First, we investigate a special class of G-VCGs, the Identical Preference VCGs (IP-VCGs), which recovers the foundational work of \cite{chaudhuri2008network}. We show that it converges even under a completely greedy policy and completely synchronous settings, and we provide a stochastic bound on the convergence rate. Second, for general G-VCGs, we propose a greediness-preserving Metropolis-Hastings based policy that each agent can initiate using only limited information, and we prove its optimality under asynchronous settings using the theory of regular perturbed Markov processes. The policy is also observed empirically to be robust under independently synchronous settings. Third, in the spirit of ``robust coloring'', we include an expected loss term in the objective function to balance utility against robustness. An optimal coloring for this robust welfare optimization is derived through a second-stage MH-policy driven algorithm. Simulation experiments showcase the efficiency of the proposed strategy.
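To make the idea of a greediness-preserving, Metropolis-Hastings style update concrete, the following is a minimal sketch under stated assumptions and is not the policy analyzed in this work. The utility (minus the number of same-colored neighbors), the uniform color proposal, the fixed `temperature`, and all function names are hypothetical choices made only for illustration. One agent is activated per step (an asynchronous setting), improving or neutral moves are always accepted, and a worsening move is accepted with probability exp(delta / temperature).

```python
import math
import random


def local_utility(graph, coloring, v, color):
    """Hypothetical utility: minus the number of neighbors of v that use `color`."""
    return -sum(1 for u in graph[v] if coloring[u] == color)


def welfare(graph, coloring):
    """Sum of local utilities over all agents (each conflicting edge counts twice)."""
    return sum(local_utility(graph, coloring, v, coloring[v]) for v in graph)


def accept_prob(delta, temperature):
    """Metropolis acceptance: always accept non-worsening moves (greediness kept)."""
    if delta >= 0:
        return 1.0
    return math.exp(delta / temperature)


def mh_coloring(graph, coloring, n_colors, temperature, steps, rng):
    """Asynchronous updates: one randomly chosen agent proposes a color per step."""
    coloring = dict(coloring)  # work on a copy, leave the input untouched
    vertices = list(graph)
    for _ in range(steps):
        v = rng.choice(vertices)
        proposal = rng.randrange(n_colors)  # uniform proposal (assumption)
        # The agent only inspects its own neighbors, i.e. local information.
        delta = (local_utility(graph, coloring, v, proposal)
                 - local_utility(graph, coloring, v, coloring[v]))
        if rng.random() < accept_prob(delta, temperature):
            coloring[v] = proposal
    return coloring


# Usage: a 4-cycle with 2 colors, started from the all-same coloring.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
start = {v: 0 for v in cycle}
final = mh_coloring(cycle, start, n_colors=2, temperature=0.5,
                    steps=500, rng=random.Random(0))
print(welfare(cycle, start), welfare(cycle, final))
```

In this toy setting, lowering `temperature` makes the rule approach a purely greedy policy, while a larger value allows more frequent exploratory moves away from the current coloring.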