In recent years, deep reinforcement learning has emerged as a technique for solving closed-loop flow control problems. Employing simulation-based environments in reinforcement learning enables a priori end-to-end optimization of the control system, provides a virtual testbed for safety-critical control applications, and yields a deep understanding of the control mechanisms. While reinforcement learning has been applied successfully to a number of rather simple flow control benchmarks, a major bottleneck on the path to real-world applications is the high computational cost and turnaround time of flow simulations. In this contribution, we demonstrate the benefits of model-based reinforcement learning for flow control applications. Specifically, we optimize the policy by alternating between trajectories sampled from flow simulations and trajectories sampled from an ensemble of environment models. For the fluidic pinball test case, model-based learning reduces the overall training time by up to $85\%$. Even larger savings are expected for more demanding flow simulations.
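The alternation between simulation trajectories and model-ensemble trajectories can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the implementation used in this work: a toy scalar system (`simulate`) stands in for the expensive flow simulation, the environment models are linear least-squares fits on bootstrap resamples, and a crude three-candidate search (`improve`) stands in for the policy-gradient update. All function names and constants are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(gain, steps=20):
    """Stand-in for an expensive flow simulation (toy scalar dynamics, assumed)."""
    s, traj = 1.0, []
    for _ in range(steps):
        a = -gain * s + 0.1 * rng.standard_normal()   # policy action plus exploration noise
        s_next = 0.9 * s + 0.5 * a + 0.01 * rng.standard_normal()
        traj.append((s, a, s_next))
        s = s_next
    return traj

def fit_ensemble(data, n_models=5):
    """Fit linear models s' ~ w0*s + w1*a, one per bootstrap resample of the data."""
    X = np.array([(s, a) for s, a, _ in data])
    y = np.array([s_next for _, _, s_next in data])
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), len(X))
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        models.append(w)
    return models

def model_rollout(models, gain, steps=20):
    """Cheap trajectory: each step uses a randomly chosen ensemble member."""
    s, traj = 1.0, []
    for _ in range(steps):
        a = -gain * s
        w = models[rng.integers(len(models))]
        s_next = float(w @ np.array([s, a]))
        traj.append((s, a, s_next))
        s = s_next
    return traj

def cost(traj):
    """Quadratic penalty on the state with a small penalty on the action."""
    return sum(s_next**2 + 0.01 * a**2 for _, a, s_next in traj)

def improve(gain, rollout, step=0.1):
    """Placeholder policy update: keep the best of three nearby gains."""
    candidates = [gain - step, gain, gain + step]
    return min(candidates, key=lambda g: cost(rollout(g)))

def train(iterations=6, model_updates=5):
    gain, data, sim_calls = 0.0, [], 0
    for _ in range(iterations):
        data += simulate(gain)            # expensive: one simulation trajectory
        sim_calls += 1
        models = fit_ensemble(data)       # refit the ensemble on all data so far
        for _ in range(model_updates):    # cheap: several model-based updates
            gain = improve(gain, lambda g: model_rollout(models, g))
    return gain, sim_calls
```

Calling `gain, sim_calls = train()` performs 30 policy updates while invoking the stand-in simulation only 6 times. That ratio is the source of the savings in this toy setting; the actual reduction in any real case depends on the relative cost of simulation and model training.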