Iterative Model-Learning Scheme via Gaussian Processes for Nonlinear Model Predictive Control of (Semi-)Batch Processes

Batch processes are inherently transient and typically nonlinear, which motivates nonlinear model predictive control (NMPC). However, the adoption of NMPC is hindered by the high cost and limited availability of dynamic models. We therefore propose to use Gaussian processes (GPs) in a model-learning NMPC scheme (GP-MLMPC) for batch processes. We initialize the GP-MLMPC using data from a single initial trajectory, e.g., from a PI controller. We then iteratively apply the NMPC with embedded GPs to run batches and update the GPs with the new observations from each iteration, thereby achieving batch-wise improvements. Using the uncertainty quantification provided by the GPs, we formulate chance constraints that enforce safe operation at the required confidence levels. We demonstrate our approach \textit{in silico} on a semi-batch polymerization reactor for tracking and economic objectives over durations of two hours, with the reactor temperature constrained to within $\pm 2\,^\circ\mathrm{C}$ of its setpoint. After only four batch iterations, the tracking error of the GP-MLMPC scheme converged to a value $83\%$ lower than that of the initial trajectory. Furthermore, under an economic objective, the GP-MLMPC achieved a 17-fold increase in final product mass by iteration 8, compared to the initial trajectory. In both cases, the resulting GP-MLMPC performance is on par with that of the full-model NMPC, which shows that the optimal controller can be learned by the approach. By collecting samples around the optimal trajectory, the GP-MLMPC remains sample-efficient across iterations and converges quickly. Thus, the proposed GP-MLMPC scheme is a promising data-efficient approach for the control of nonlinear batch processes without mechanistic knowledge.
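
To illustrate how a GP's predictive uncertainty can be turned into a chance constraint on the reactor temperature, the following is a minimal sketch and not the formulation used in this work. It assumes a zero-mean GP with a squared-exponential kernel and fixed hyperparameters, and it approximates the two-sided temperature constraint by two individual one-sided constraints, each tightened by a Gaussian quantile. The function names (`rbf_kernel`, `gp_predict`, `chance_constraint_margin`) and all default values are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between row-wise inputs a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise_var=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP (assumed model)."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    chol = np.linalg.cholesky(k_tt)
    # Solve K alpha = y with two triangular systems via the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def chance_constraint_margin(mean, std, setpoint, half_width=2.0, confidence=0.95):
    """Margin of the tightened constraints; non-negative means satisfied.

    Each one-sided constraint is backed off by z * std, where z is the
    standard-normal quantile of the chosen confidence level (an assumption).
    """
    z = NormalDist().inv_cdf(confidence)
    upper = (setpoint + half_width) - (mean + z * std)
    lower = (mean - z * std) - (setpoint - half_width)
    return np.minimum(upper, lower)
```

In an NMPC formulation of this kind, the margin would be required to stay non-negative at every step of the prediction horizon, so that predictions far from the collected data (large standard deviation) are automatically treated more conservatively.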
