This paper investigates how deep multi-agent reinforcement learning can enable the scalable and privacy-preserving coordination of residential energy flexibility. The coordination of distributed resources such as electric vehicles and heating will be critical to the successful integration of large shares of renewable energy in our electricity grid and, thus, to helping mitigate climate change. Pre-learning individual reinforcement learning policies can enable distributed control with no sharing of personal data required during execution. However, previous multi-agent reinforcement learning approaches to the coordination of distributed energy resources impose an ever greater computational burden during training as the size of the system increases. We therefore adopt a deep multi-agent actor-critic method which uses a \emph{centralised but factored critic} to rehearse coordination ahead of execution. Results show that coordination is achieved at scale, with minimal information and communication infrastructure requirements, no interference with daily activities, and privacy protection. Significant savings are obtained for energy users and the distribution network, and in greenhouse gas emissions. Moreover, for 30 homes, training times are nearly 40 times shorter than with a previous state-of-the-art reinforcement learning approach without the factored critic.
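To illustrate why factoring the critic can ease the training burden, the following is a minimal sketch and not the implementation used in this paper. It assumes a purely additive factorisation with one shared linear utility per home. The names `agent_utility`, `factored_q` and `critic_input_size` are hypothetical and introduced only for this example.

```python
# Illustrative sketch only: an additive factorisation with shared linear
# utilities is an assumption made for this example, not the paper's critic.

def agent_utility(weights, obs, action):
    """Per-agent utility Q_i(o_i, a_i), computed from local inputs only."""
    features = list(obs) + [action]
    return sum(w * x for w, x in zip(weights, features))


def factored_q(weights, observations, actions):
    """Joint value as the sum of per-agent utilities (weights are shared)."""
    return sum(
        agent_utility(weights, obs, action)
        for obs, action in zip(observations, actions)
    )


def critic_input_size(n_agents, obs_dim, factored):
    """Number of inputs a single critic function has to process.

    A monolithic critic sees every agent's observation and action at once,
    so its input grows with the number of agents. A factored critic with
    shared weights only ever sees one agent's observation and action.
    """
    per_agent = obs_dim + 1  # local observation plus one scalar action
    return per_agent if factored else n_agents * per_agent


# Two homes, each with a 2-dimensional observation and a scalar action.
shared_weights = [1.0, 2.0, 0.5]
observations = [[1.0, 0.0], [0.0, 1.0]]
actions = [2.0, 4.0]
q_total = factored_q(shared_weights, observations, actions)

# Critic input size for 30 homes, monolithic versus factored.
monolithic_inputs = critic_input_size(30, 2, factored=False)
factored_inputs = critic_input_size(30, 2, factored=True)
print(q_total, monolithic_inputs, factored_inputs)
```

In this sketch the per-agent utility takes the same number of inputs however many homes are coordinated, whereas the input to a monolithic joint critic grows with every home added. This is one intuition for why a factored critic may scale better during training.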