In systems governed by nonlinear partial differential equations, such as fluid flows, the design of state estimators such as Kalman filters relies on a reduced-order model (ROM) that projects the original high-dimensional dynamics onto a computationally tractable low-dimensional space. However, ROMs are prone to large errors, which degrade the performance of the estimator. Here, we introduce the reinforcement learning reduced-order estimator (RL-ROE), a ROM-based estimator in which the correction term that takes in the measurements is given by a nonlinear policy trained through reinforcement learning. The nonlinearity of the policy enables the RL-ROE to compensate efficiently for errors of the ROM, while still taking advantage of the imperfect knowledge of the dynamics. Using examples involving the Burgers and Navier-Stokes equations, we show that in the limit of very few sensors, the trained RL-ROE outperforms a Kalman filter designed using the same ROM. Moreover, it yields accurate high-dimensional state estimates for trajectories corresponding to various physical parameter values, without direct knowledge of those values.
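The estimator structure described above can be illustrated with a minimal sketch, assuming a linear discrete-time ROM and an update of the form "ROM prediction plus measurement-driven correction". Everything in it is hypothetical and chosen only for illustration: the matrices `A`, `C`, `K`, and `W`, the function names, and in particular the `tanh` map, which is an untrained placeholder standing in for a policy that would in practice be learned through reinforcement learning.

```python
import numpy as np


def estimator_step(z_prev, y, A, correction):
    """One estimator update: ROM prediction plus a correction driven by the measurement y."""
    return A @ z_prev + correction(z_prev, y)


def make_linear_correction(A, C, K):
    """Kalman-style correction: a fixed gain K applied to the innovation."""
    def correction(z_prev, y):
        return K @ (y - C @ (A @ z_prev))
    return correction


def make_nonlinear_correction(A, C, W):
    """Placeholder nonlinear correction (assumption: NOT a trained RL policy)."""
    def correction(z_prev, y):
        innovation = y - C @ (A @ z_prev)
        return np.tanh(W @ innovation)
    return correction


# Toy two-state reduced model observed through a single sensor (all values assumed).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
K = np.array([[0.5], [0.1]])  # hand-picked gain, not an optimal Kalman gain
W = np.array([[0.5], [0.1]])  # weights of the placeholder nonlinear map

linear = make_linear_correction(A, C, K)
nonlinear = make_nonlinear_correction(A, C, W)

z_true = np.array([1.0, -1.0])
z_hat_kf = np.zeros(2)  # estimate using the linear correction
z_hat_pi = np.zeros(2)  # estimate using the nonlinear correction

for _ in range(50):
    z_true = A @ z_true  # true reduced state evolves
    y = C @ z_true       # sparse measurement (one sensor)
    z_hat_kf = estimator_step(z_hat_kf, y, A, linear)
    z_hat_pi = estimator_step(z_hat_pi, y, A, nonlinear)

print("linear-correction error:   ", np.linalg.norm(z_true - z_hat_kf))
print("nonlinear-correction error:", np.linalg.norm(z_true - z_hat_pi))
```

The only difference between the two estimators in this sketch is the `correction` callable, which mirrors the idea of keeping the ROM prediction while replacing a linear gain with a nonlinear map of the measurement and the previous estimate. The toy model here has no ROM error, so it says nothing about the performance comparison reported in the abstract.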