The optimal dispatch of energy storage systems (ESSs) is challenging because of the uncertainty introduced by fluctuations in dynamic prices, demand, and renewable-based energy generation. By exploiting the generalization capabilities of deep neural networks (DNNs), deep reinforcement learning (DRL) algorithms can learn good-quality control models that respond adaptively to the stochastic nature of distribution networks. However, current DRL algorithms cannot strictly enforce operational constraints and often produce infeasible control actions. To address this issue, we propose a DRL framework that handles continuous action spaces while strictly enforcing the operational constraints of the environment and the action space during online operation. First, the proposed framework trains an action-value function modeled using DNNs. Subsequently, this action-value function is reformulated as a mixed-integer programming (MIP) problem, which allows the environment's operational constraints to be included explicitly. Comprehensive numerical simulations show that the proposed MIP-DRL framework outperforms state-of-the-art DRL algorithms, enforcing all constraints while delivering dispatch decisions of high quality relative to the optimal solution obtained with a perfect forecast of the stochastic variables.
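The abstract does not state how the trained action-value function is turned into a MIP, so the following is only an illustrative sketch under stated assumptions. It assumes the DNN uses ReLU activations and that known bounds $L < 0 < U$ exist on each neuron's pre-activation, and it uses the standard big-M encoding of a ReLU unit. All symbols ($Q_\theta$, $s_t$, $a$, $g$, $w$, $b$, $z$, $y$, $\delta$, $L$, $U$) are notation introduced here for illustration, not taken from the source.

```latex
% One hidden neuron with input x, pre-activation z and output y = max(0, z),
% assuming bounds L <= z <= U with L < 0 < U (assumed known, e.g. from interval arithmetic).
\begin{align}
  z &= w^{\top} x + b, \\
  y &\ge z, \qquad y \ge 0, \\
  y &\le z - L\,(1 - \delta), \qquad y \le U\,\delta, \\
  \delta &\in \{0, 1\}.
\end{align}

% Online action selection: maximize the MIP-encoded action-value function
% over actions that satisfy the operational constraints g (hypothetical notation).
\begin{equation}
  a_t \in \arg\max_{a} \; Q_\theta(s_t, a)
  \quad \text{s.t.} \quad g(s_t, a) \le 0 .
\end{equation}
```

With $\delta = 1$ the constraints force $y = z$ and hence $z \ge 0$; with $\delta = 0$ they force $y = 0$ and $z \le 0$. The encoding is therefore exact for $y = \max(0, z)$ whenever the bounds hold. Because the operational constraints appear as explicit constraints of the optimization problem, any action returned by the solver is feasible by construction, which is consistent with the strict enforcement during online operation described above.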