In recent years, deep reinforcement learning has emerged as a technique for solving closed-loop flow control problems. Employing simulation-based environments in reinforcement learning enables a priori end-to-end optimization of the control system, provides a virtual testbed for safety-critical control applications, and yields a deeper understanding of the control mechanisms. While reinforcement learning has been applied successfully to several relatively simple flow control benchmarks, a major bottleneck toward real-world applications is the high computational cost and turnaround time of flow simulations. In this contribution, we demonstrate the benefits of model-based reinforcement learning for flow control applications. Specifically, we optimize the policy by alternating between trajectories sampled from flow simulations and trajectories sampled from an ensemble of environment models. Model-based learning reduces the overall training time by up to $85\%$ for the fluidic pinball test case. Even larger savings are expected for more demanding flow simulations.
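The alternation between simulation trajectories and ensemble-model trajectories can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the method used in this work: a scalar linear system stands in for the flow simulation, the environment models are bootstrapped least-squares fits, the policy is a single feedback gain `k`, and the policy update is a simple two-candidate comparison. The names `simulate`, `fit_ensemble`, `model_rollout`, `train`, and the parameter `sim_every` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(k, steps=20, explore=0.05):
    """Stand-in for an expensive flow simulation (assumed toy dynamics)."""
    x, traj = 1.0, []
    for _ in range(steps):
        u = -k * x + explore * rng.standard_normal()  # policy plus exploration
        x_next = 0.9 * x + 0.5 * u + 0.01 * rng.standard_normal()
        traj.append((x, u, x_next))
        x = x_next
    return traj

def fit_ensemble(data, n_models=5):
    """Fit linear models x' ~ a*x + b*u on bootstrap resamples of the data."""
    data = np.asarray(data)
    models = []
    for _ in range(n_models):
        sample = data[rng.integers(0, len(data), len(data))]
        coef, *_ = np.linalg.lstsq(sample[:, :2], sample[:, 2], rcond=None)
        models.append(coef)
    return models

def model_rollout(k, models, steps=20):
    """Cheap trajectory generated by one randomly chosen ensemble member."""
    a, b = models[rng.integers(len(models))]
    x, traj = 1.0, []
    for _ in range(steps):
        u = -k * x
        x_next = a * x + b * u
        traj.append((x, u, x_next))
        x = x_next
    return traj

def cost(traj):
    """Quadratic penalty on the state and the control effort."""
    return sum(x_next**2 + 0.1 * u**2 for _, u, x_next in traj)

def train(iterations=30, sim_every=5, delta=0.1):
    k, data, models, sim_calls = 0.0, [], [], 0
    for it in range(iterations):
        candidates = (k - delta, k + delta)
        if it % sim_every == 0:
            # Simulation phase: collect real data and refit the ensemble.
            trajs = [simulate(c) for c in candidates]
            sim_calls += len(trajs)
            for t in trajs:
                data.extend(t)
            models = fit_ensemble(data)
        else:
            # Model phase: evaluate the candidates on the learned models only.
            trajs = [model_rollout(c, models) for c in candidates]
        costs = [cost(t) for t in trajs]
        k = candidates[int(np.argmin(costs))]  # keep the better candidate
    return k, sim_calls

k_final, sim_calls = train()
print(f"final gain: {k_final:.2f}, simulation rollouts: {sim_calls}")
```

With the default settings, only 12 of the 60 rollouts come from the stand-in simulation; the rest are generated by the ensemble. In this toy, `sim_every` controls how often the models are refreshed with new simulation data, which trades model accuracy against simulation cost.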