Model Predictive Control (MPC) based on Input Convex Neural Networks (ICNNs) attains globally optimal solutions by preserving convexity within the MPC framework. However, current ICNN architectures suffer from vanishing/exploding gradients, which limits their ability to serve as deep neural networks for complex tasks. In addition, current neural network-based MPC, including both conventional neural network-based MPC and ICNN-based MPC, converges more slowly than MPC based on first-principles models. In this study, we build on the principles of ICNNs to propose a novel Input Convex LSTM for Lyapunov-based MPC, with the specific goals of reducing convergence time and mitigating the vanishing/exploding gradient problem while ensuring closed-loop stability. In a simulation study of a nonlinear chemical reactor, we observed mitigation of the vanishing/exploding gradient problem and a reduction in convergence time of 46.7%, 31.3%, and 20.2% relative to the baseline plain RNN, plain LSTM, and Input Convex Recurrent Neural Network, respectively.
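To make the convexity property concrete, the following is a minimal sketch of a generic feedforward ICNN, not the Input Convex LSTM proposed in this work. It assumes the standard construction in which the weights acting on hidden activations are non-negative and the activation (here ReLU) is convex and non-decreasing, so the scalar output is convex in the input. The names `init_params`, `icnn`, and `convexity_gap`, and all layer sizes, are hypothetical choices made for illustration.

```python
import numpy as np


def relu(x):
    # Convex and non-decreasing, which the ICNN construction relies on.
    return np.maximum(x, 0.0)


def init_params(in_dim, hidden, depth, rng):
    """Random parameters for a small feedforward ICNN (illustrative only)."""
    layers = []
    z_dim = 0  # the first layer has no previous hidden state
    for k in range(depth):
        out_dim = 1 if k == depth - 1 else hidden
        W_x = rng.normal(size=(out_dim, in_dim))         # skip weights, unconstrained
        W_z = np.abs(rng.normal(size=(out_dim, z_dim)))  # hidden weights, kept >= 0
        b = rng.normal(size=out_dim)
        layers.append((W_x, W_z, b))
        z_dim = out_dim
    return layers


def icnn(x, layers):
    """Scalar output that is convex in x under the constraints above."""
    z = np.zeros(0)
    for W_x, W_z, b in layers:
        z = relu(W_x @ x + W_z @ z + b)
    return float(z[0])


def convexity_gap(f, a, b, t):
    """t*f(a) + (1-t)*f(b) - f(t*a + (1-t)*b); non-negative for convex f."""
    return t * f(a) + (1.0 - t) * f(b) - f(t * a + (1.0 - t) * b)


rng = np.random.default_rng(0)
layers = init_params(in_dim=3, hidden=8, depth=4, rng=rng)
a, b = rng.normal(size=3), rng.normal(size=3)
gap = convexity_gap(lambda x: icnn(x, layers), a, b, t=0.3)
print(gap >= -1e-9)
```

Because the output is convex in the input, an optimization problem that uses such a network as its model can, under suitable assumptions on the cost and constraints, be solved to a global optimum, which is the property ICNN-based MPC relies on.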