Model Predictive Control (MPC) built on Input Convex Neural Networks (ICNNs) attains globally optimal solutions by preserving convexity within the MPC framework. However, current ICNN architectures suffer from exploding gradients, which limits their use as deep neural networks for complex tasks. In addition, current neural network-based MPC, whether built on conventional neural networks or on ICNNs, converges more slowly than MPC based on first-principles models. In this study, we build on the principles of ICNNs to propose a novel Input Convex LSTM for MPC, with two specific goals: mitigating the exploding gradient problem in current ICNNs and reducing the convergence time of NN-based MPC. In a simulation study of a nonlinear chemical reactor, we observed reductions in convergence time of 46.7%, 31.3%, and 20.2% compared to the baseline plain RNN, plain LSTM, and Input Convex RNN, respectively.
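To make the convexity property concrete, the following is a minimal NumPy sketch of the standard feedforward input convex construction: non-negative weights on the hidden-to-hidden path combined with a convex, non-decreasing activation (ReLU). It is not the Input Convex LSTM proposed in this study; the function names `make_icnn`, `icnn_forward`, and `max_convexity_violation`, the layer sizes, and the random initialization are all assumptions chosen for illustration. The last function checks the convexity inequality numerically on random input pairs.

```python
import numpy as np


def make_icnn(input_dim, hidden_dim, n_layers, seed=0):
    """Build random parameters for a small feedforward input convex network.

    Hypothetical helper for illustration only. Each layer holds (W_z, W_x, b):
    W_z acts on the previous hidden state and is kept elementwise non-negative,
    W_x acts directly on the input and is unconstrained.
    """
    rng = np.random.default_rng(seed)
    layers = []
    for k in range(n_layers):
        out_dim = 1 if k == n_layers - 1 else hidden_dim
        W_x = rng.normal(size=(out_dim, input_dim))
        # The first layer has no previous hidden state, so it has no W_z.
        W_z = None if k == 0 else np.abs(rng.normal(size=(out_dim, hidden_dim)))
        b = rng.normal(size=out_dim)
        layers.append((W_z, W_x, b))
    return layers


def icnn_forward(layers, x):
    """Scalar output that is convex in x.

    Convexity follows because each layer is a non-negative combination of
    convex functions plus an affine term, passed through ReLU, which is
    convex and non-decreasing.
    """
    z = None
    for W_z, W_x, b in layers:
        pre = W_x @ x + b
        if W_z is not None:
            pre = pre + W_z @ z
        z = np.maximum(pre, 0.0)
    return float(z[0])


def max_convexity_violation(layers, input_dim, n_trials=1000, seed=1):
    """Largest observed f(t*a + (1-t)*b) - (t*f(a) + (1-t)*f(b)) over random pairs.

    For a convex f this quantity is never positive, up to rounding error.
    """
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_trials):
        a = rng.normal(size=input_dim)
        b = rng.normal(size=input_dim)
        t = rng.uniform()
        lhs = icnn_forward(layers, t * a + (1.0 - t) * b)
        rhs = t * icnn_forward(layers, a) + (1.0 - t) * icnn_forward(layers, b)
        worst = max(worst, lhs - rhs)
    return worst


layers = make_icnn(input_dim=3, hidden_dim=8, n_layers=3)
print(max_convexity_violation(layers, input_dim=3))
```

Replacing `np.abs` with unconstrained weights in `make_icnn` generally breaks the inequality. This is why input convex architectures constrain the sign of these weights rather than relying on training to keep the model convex.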