Although multi-agent reinforcement learning (MARL) has made considerable progress in recent years, solving systems with a large number of agents remains a hard challenge. Graphon mean field games (GMFGs) enable the scalable analysis of MARL problems that are otherwise intractable. Because of the mathematical structure of graphons, however, this approach is limited to dense graphs, which are insufficient to describe many real-world networks such as power law graphs. Our paper introduces a novel formulation of GMFGs, called LPGMFGs, which leverages the graph-theoretical concept of $L^p$ graphons and provides a machine learning tool to efficiently and accurately approximate solutions of sparse network problems. In particular, this includes power law networks, which are empirically observed in various application areas and cannot be captured by standard graphons. We derive theoretical existence and convergence guarantees and give empirical examples that demonstrate the accuracy of our learning approach for systems with many agents. Furthermore, we extend the Online Mirror Descent (OMD) learning algorithm to our setup to accelerate learning, show its capabilities empirically, and conduct a theoretical analysis using the novel concept of smoothed step graphons. Overall, we provide a scalable, mathematically well-founded machine learning approach to a large class of otherwise intractable problems of great relevance in numerous research fields.
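As a rough illustration of how an unbounded $L^p$ graphon can generate sparse power law networks, the sketch below samples a graph from the power law graphon $W(x, y) = (1-\alpha)^2 (xy)^{-\alpha}$, which is in $L^p$ for $p < 1/\alpha$ but is not bounded. This is a standard construction from the sparse graph limit literature and is an assumption here, not necessarily the exact model used in the paper; the function names, the sparsity parameter `rho`, and the default `alpha` are hypothetical.

```python
import numpy as np


def power_law_graphon(x, y, alpha=0.5):
    """Illustrative power law graphon W(x, y) = (1 - alpha)^2 * (x * y)^(-alpha).

    It integrates to 1 over the unit square and is unbounded near 0.
    """
    return (1.0 - alpha) ** 2 * (x * y) ** (-alpha)


def sample_sparse_graph(n, rho, alpha=0.5, seed=0):
    """Sample an n-node graph with edge probability min(1, rho * W(x_i, x_j)).

    `rho` is an assumed sparsity scale; smaller values give sparser graphs.
    Returns a symmetric 0/1 adjacency matrix with an empty diagonal.
    """
    rng = np.random.default_rng(seed)
    # Latent positions in (0, 1]; using 1 - U avoids dividing by zero.
    x = 1.0 - rng.random(n)
    probs = np.minimum(1.0, rho * power_law_graphon(x[:, None], x[None, :], alpha))
    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(np.int8)


if __name__ == "__main__":
    n = 2000
    adj = sample_sparse_graph(n, rho=n ** -0.5)
    degrees = adj.sum(axis=1)
    print("mean degree:", degrees.mean(), "max degree:", degrees.max())
```

Ignoring the truncation at 1, a node with latent position $x$ has expected degree proportional to $x^{-\alpha}$, so a few nodes with small $x$ become hubs while most nodes keep a low degree. A bounded graphon cannot produce this heavy-tailed behavior.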