In this paper, a novel mechanism-driven reinforcement learning framework is proposed for airfoil shape optimization. To validate the framework, a reward function is designed and analyzed, from which the equivalence between maximizing the cumulative reward and achieving the optimization objectives is guaranteed theoretically. To enable high-quality exploration and to obtain an accurate reward from the environment, an efficient solver for the steady Euler equations is employed in the reinforcement learning method. The solver uses a B\'ezier curve to describe the shape of the airfoil and a Newton-geometric multigrid method for the solution. In particular, a dual-weighted residual-based h-adaptive method is used for efficient calculation of the target functional. To effectively streamline the airfoil shape during the deformation process, we introduce Laplacian smoothing and propose a B\'ezier fitting strategy, which not only mitigates mesh tangling but also guarantees precise manipulation of the geometry. In addition, a neural network architecture based on an attention mechanism is designed to make the learning process more sensitive to minor changes in the airfoil geometry. Numerical experiments demonstrate that our framework can handle optimization problems with hundreds of design variables. It is worth mentioning that, prior to this work, few studies have combined such a high-fidelity partial differential equation framework with advanced reinforcement learning algorithms for design problems of such high dimensionality.
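To make the geometric ingredients above concrete, the following is a minimal sketch of B\'ezier evaluation, Laplacian smoothing of a point sequence, and a B\'ezier fit to the smoothed points. It is not the authors' implementation: the function names (`bezier_point`, `laplacian_smooth`, `fit_bezier`), the uniform parameterization, the fixed endpoints, the damping factor `lam`, and the use of a plain least-squares fit are all assumptions made for illustration.

```python
from math import comb


def bernstein_row(n, t):
    """Values of the n+1 Bernstein polynomials of degree n at parameter t."""
    return [comb(n, i) * t**i * (1.0 - t) ** (n - i) for i in range(n + 1)]


def bezier_point(ctrl, t):
    """Evaluate a Bezier curve with control points ctrl [(x, y), ...] at t in [0, 1]."""
    b = bernstein_row(len(ctrl) - 1, t)
    return (sum(w * p[0] for w, p in zip(b, ctrl)),
            sum(w * p[1] for w, p in zip(b, ctrl)))


def laplacian_smooth(pts, lam=0.5, iters=1):
    """Move each interior point toward the midpoint of its two neighbours.

    Endpoints are kept fixed (an assumption, e.g. leading/trailing edge).
    """
    pts = [tuple(p) for p in pts]
    for _ in range(iters):
        new = list(pts)
        for i in range(1, len(pts) - 1):
            new[i] = tuple(
                (1.0 - lam) * pts[i][k] + lam * 0.5 * (pts[i - 1][k] + pts[i + 1][k])
                for k in range(2)
            )
        pts = new
    return pts


def solve(A, B):
    """Solve A X = B by Gaussian elimination with partial pivoting.

    A is n x n, B is n x m (several right-hand sides); returns X as n x m lists.
    """
    n, m = len(A), len(B[0])
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + m):
                M[r][j] -= f * M[c][j]
    X = [[0.0] * m for _ in range(n)]
    for r in range(n - 1, -1, -1):
        for k in range(m):
            s = M[r][n + k] - sum(M[r][j] * X[j][k] for j in range(r + 1, n))
            X[r][k] = s / M[r][r]
    return X


def fit_bezier(pts, degree):
    """Least-squares Bezier control points for pts, using uniform parameters (assumption)."""
    ts = [i / (len(pts) - 1) for i in range(len(pts))]
    B = [bernstein_row(degree, t) for t in ts]
    n = degree + 1
    # Normal equations (B^T B) C = B^T P for the x and y coordinates together.
    BtB = [[sum(row[i] * row[j] for row in B) for j in range(n)] for i in range(n)]
    BtP = [[sum(row[i] * p[d] for row, p in zip(B, pts)) for d in range(2)]
           for i in range(n)]
    return [tuple(c) for c in solve(BtB, BtP)]


if __name__ == "__main__":
    # Hypothetical upper-surface control points, sampled, smoothed, then refitted.
    ctrl = [(0.0, 0.0), (0.3, 0.2), (0.7, 0.15), (1.0, 0.0)]
    samples = [bezier_point(ctrl, i / 20) for i in range(21)]
    smoothed = laplacian_smooth(samples, lam=0.5, iters=3)
    print(fit_bezier(smoothed, degree=3))
```

Refitting after smoothing returns the geometry to a small set of control points, which is one plausible reason to pair the two steps; the actual fitting strategy and its role in avoiding mesh tangling are defined in the paper itself.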