Mean field games (MFGs) have emerged as a powerful framework for modeling interactions in large-scale multi-agent systems. Despite recent advancements in reinforcement learning (RL) for MFGs, existing methods are typically limited to finite spaces or stationary models, hindering their applicability to real-world problems. This paper introduces a novel deep reinforcement learning (DRL) algorithm specifically designed for non-stationary continuous MFGs. The proposed approach builds upon a Fictitious Play (FP) methodology, leveraging DRL for best-response computation and supervised learning for average policy representation. Furthermore, it learns a representation of the time-dependent population distribution using a Conditional Normalizing Flow. To validate the effectiveness of our method, we evaluate it on three different examples of increasing complexity. By addressing critical limitations in scalability and density approximation, this work represents a significant advancement in applying DRL techniques to complex MFG problems, bringing the field closer to real-world multi-agent systems.
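For illustration only, the sketch below shows one way the density-approximation component described above could look: a RealNVP-style conditional normalizing flow in PyTorch whose coupling layers are conditioned on the time index, so that it can represent a non-stationary population distribution μ_t(x). All class names, layer sizes, and the `sample_batch` data loader are assumptions made for this sketch; this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class ConditionalCoupling(nn.Module):
    """Affine coupling layer whose scale/shift depend on the untouched half and on time t."""
    def __init__(self, dim, hidden=64, flip=False):
        super().__init__()
        self.flip = flip
        self.half = dim // 2
        # Conditioner network: takes the passive half of x plus the scalar time t.
        self.net = nn.Sequential(
            nn.Linear(self.half + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def inverse(self, y, t):
        # Map a data point y back to the base space (used for likelihood evaluation).
        if self.flip:
            y = y.flip(-1)
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, b = self.net(torch.cat([y1, t], dim=-1)).chunk(2, dim=-1)
        s = torch.tanh(s)                      # bounded log-scales for stability
        x2 = (y2 - b) * torch.exp(-s)
        x = torch.cat([y1, x2], dim=-1)
        if self.flip:
            x = x.flip(-1)
        return x, -s.sum(dim=-1)               # log|det d(inverse)/dy|


class ConditionalNormalizingFlow(nn.Module):
    """Models a time-dependent population density mu_t(x) via maximum likelihood."""
    def __init__(self, dim=2, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            [ConditionalCoupling(dim, flip=bool(i % 2)) for i in range(n_layers)]
        )
        self.base = torch.distributions.Normal(torch.zeros(dim), torch.ones(dim))

    def log_prob(self, x, t):
        # Change-of-variables: log mu_t(x) = log p_base(z) + sum of inverse log-dets.
        log_det = torch.zeros(x.shape[0])
        z = x
        for layer in self.layers:
            z, ld = layer.inverse(z, t)
            log_det = log_det + ld
        return self.base.log_prob(z).sum(dim=-1) + log_det


# Hypothetical usage: fit mu_t by maximum likelihood on (t, state) pairs
# collected from trajectories induced by the averaged policy.
flow = ConditionalNormalizingFlow(dim=2)
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
for step in range(1000):
    t, x = sample_batch()          # hypothetical loader: t of shape (B, 1), x of shape (B, 2)
    loss = -flow.log_prob(x, t).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In a Fictitious Play scheme of the kind outlined in the abstract, such a time-conditioned density model would be refit at each iteration from rollouts of the averaged policy, and the DRL best-response step would query it to evaluate the mean-field-dependent reward.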