This paper introduces an open-source simulation framework for studying implicit cooperation in local energy markets (LEMs) with multi-agent reinforcement learning (MARL). The market is modeled as a decentralized partially observable Markov decision process and implemented as a Gymnasium environment. The framework features a modular market platform with plug-and-play clearing mechanisms, physically constrained agent models (including battery storage), a realistic grid network, and a comprehensive analytics suite to evaluate emergent coordination. The main contribution is a novel method to foster implicit cooperation: agents' observations and rewards are augmented with system-level key performance indicators, so that each agent can independently learn strategies that benefit the entire system without explicit communication. Through representative case studies (available in a dedicated GitHub repository at https://github.com/salazarna/marlem), we show the framework's ability to analyze how different market configurations, such as varying storage deployment, impact system performance. These case studies illustrate its potential to facilitate emergent coordination, improve market efficiency, and strengthen grid stability. The proposed simulation framework is a flexible, extensible, and reproducible tool for researchers and practitioners to design, test, and validate strategies for future intelligent, decentralized energy systems.
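As a rough illustration of the idea of augmenting observations and rewards with system-level key performance indicators, the following minimal Python sketch appends shared KPIs to an agent's local observation and blends them into its reward. It is not taken from the marlem repository: the class name `KPIShaping`, the KPI names in `KPI_KEYS`, and the blending coefficient `weight` are all hypothetical, and the convex combination is only one plausible way to mix individual and system-level signals.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical KPI names, chosen for illustration only.
KPI_KEYS = ("supply_demand_balance", "market_efficiency")


@dataclass
class KPIShaping:
    """Blend an agent's private signals with shared system-level KPIs."""

    weight: float = 0.5  # assumed blending coefficient in [0, 1]

    def observe(self, local_obs: List[float], kpis: Dict[str, float]) -> List[float]:
        # Append the KPIs in a fixed order so every agent sees the same layout.
        return list(local_obs) + [kpis[key] for key in KPI_KEYS]

    def reward(self, individual_reward: float, kpis: Dict[str, float]) -> float:
        # Convex combination of the agent's own reward and the mean system KPI.
        system_term = sum(kpis[key] for key in KPI_KEYS) / len(KPI_KEYS)
        return (1.0 - self.weight) * individual_reward + self.weight * system_term


# Usage: one agent, two local features, two shared KPIs.
shaping = KPIShaping(weight=0.25)
kpis = {"supply_demand_balance": 0.8, "market_efficiency": 0.4}
obs = shaping.observe([1.0, 2.0], kpis)
rew = shaping.reward(2.0, kpis)
```

Because every agent receives the same KPI values, no agent needs to exchange messages with another; under this assumed scheme, raising `weight` shifts the learning signal from individual profit toward system-wide performance.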