This paper investigates how deep multi-agent reinforcement learning can enable the scalable and privacy-preserving coordination of residential energy flexibility. Coordinating distributed resources such as electric vehicles and heating will be critical to integrating large shares of renewable energy into the electricity grid and, in turn, to mitigating climate change. Pre-learning individual reinforcement learning policies enables distributed control without any sharing of personal data during execution. However, previous multi-agent reinforcement learning approaches to coordinating distributed energy resources impose an ever greater computational burden for training as the size of the system increases. We therefore adopt a deep multi-agent actor-critic method that uses a \emph{centralised but factored critic} to rehearse coordination ahead of execution. Results show that coordination is achieved at scale, with minimal information and communication infrastructure requirements, no interference with daily activities, and privacy protection. Significant savings are obtained for energy users and the distribution network, as well as in greenhouse gas emissions. Moreover, for 30 homes, training times are nearly 40 times shorter than with a previous state-of-the-art reinforcement learning approach that does not use the factored critic.