We propose a framework for applying reinforcement learning to contextual two-stage stochastic optimization and apply it to energy market bidding for an offshore wind farm. Reinforcement learning could be used to learn near-optimal values for the first-stage variables of a two-stage stochastic program under different contexts. Under the proposed framework, these solutions would be learned without solving the full two-stage stochastic program. We present initial results from training with the deep deterministic policy gradient (DDPG) algorithm and outline planned next steps to improve performance.
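To make the framing concrete, the following is a minimal sketch of how a contextual two-stage problem could be cast as a one-step reinforcement learning episode: the context is observed, the agent picks the first-stage decision, a scenario is sampled, and the reward is the first-stage revenue plus the second-stage settlement for that single scenario. Everything here is an assumption for illustration and not the authors' model: the prices, the Gaussian forecast error, and the names `reward`, `sample_production`, and `evaluate_policy` are hypothetical. The sketch does not implement DDPG; a trained actor network would take the place of the `policy` callable.

```python
import random

# Hypothetical market parameters (illustrative only, not from the paper).
DAY_AHEAD_PRICE = 50.0     # revenue per MWh bid in the first stage
SHORTFALL_PENALTY = 80.0   # cost per MWh bid but not produced
SURPLUS_PRICE = 20.0       # revenue per MWh produced beyond the bid


def reward(bid, actual):
    """First-stage revenue plus the second-stage settlement for one scenario."""
    shortfall = max(bid - actual, 0.0)
    surplus = max(actual - bid, 0.0)
    return DAY_AHEAD_PRICE * bid - SHORTFALL_PENALTY * shortfall + SURPLUS_PRICE * surplus


def sample_production(forecast, rng, noise_std=10.0):
    """Draw one production scenario given the context (assumed Gaussian forecast error)."""
    return max(forecast + rng.gauss(0.0, noise_std), 0.0)


def evaluate_policy(policy, forecasts, n_samples=2000, seed=0):
    """Monte Carlo estimate of the expected reward of a context -> bid policy."""
    rng = random.Random(seed)
    total = 0.0
    for forecast in forecasts:
        bid = policy(forecast)  # first-stage decision, made before the scenario is known
        for _ in range(n_samples):
            total += reward(bid, sample_production(forecast, rng))
    return total / (len(forecasts) * n_samples)


if __name__ == "__main__":
    forecasts = [60.0, 100.0, 140.0]
    print(evaluate_policy(lambda f: f, forecasts))        # bid the forecast
    print(evaluate_policy(lambda f: 2.0 * f, forecasts))  # overbid heavily
```

Because each reward uses only one sampled scenario, the agent never needs the expectation over all scenarios at once, which is the sense in which the full two-stage program is not solved during learning.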