By leveraging Input Convex Neural Networks (ICNNs), ICNN-based Model Predictive Control (MPC) attains globally optimal solutions by preserving convexity within the MPC framework. However, current ICNN architectures suffer from vanishing gradients, which limits their use as deep neural networks for complex tasks. Additionally, current neural network-based MPC, including both conventional neural network-based MPC and ICNN-based MPC, converges more slowly than MPC based on first-principles models. In this study, we draw on the principles of ICNNs to propose a novel Input Convex LSTM for Lyapunov-based MPC, with the goal of reducing convergence time and mitigating the vanishing gradient problem while ensuring closed-loop stability. In a simulation study of a nonlinear chemical reactor, we observed that the vanishing gradient problem was mitigated and that convergence time was reduced by 46.7%, 31.3%, and 20.2% relative to the baseline plain RNN, plain LSTM, and Input Convex Recurrent Neural Network, respectively.
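To make the convexity property concrete, the following is a minimal sketch of a generic feedforward ICNN, not the Input Convex LSTM proposed in this work. It assumes the standard construction in which the weights acting on the previous hidden state are constrained to be non-negative and the activation (here ReLU) is convex and non-decreasing, so the scalar output is convex in the input. All names (`make_params`, `icnn`, the layer sizes) are hypothetical and chosen only for illustration.

```python
import random

def affine(W, v):
    """Matrix-vector product for a list-of-rows matrix W."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def make_params(in_dim, hidden, depth, seed=0):
    """Random (unconstrained) parameters; hypothetical helper for this sketch."""
    rng = random.Random(seed)
    mat = lambda r, c: [[rng.gauss(0.0, 1.0) for _ in range(c)] for _ in range(r)]
    layers = []
    for k in range(depth):
        Wz = mat(hidden, hidden) if k > 0 else None  # first layer has no hidden input
        Wx = mat(hidden, in_dim)                     # skip connection from the input x
        b = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
        layers.append((Wz, Wx, b))
    return layers

def icnn(x, layers):
    """Scalar output that is convex in x (assumed generic ICNN construction)."""
    z = None
    for Wz, Wx, b in layers:
        pre = [p + q for p, q in zip(affine(Wx, x), b)]  # affine in x: any sign allowed
        if Wz is not None:
            # Clip hidden-to-hidden weights at zero: a non-negative combination
            # of convex functions plus an affine term stays convex.
            Wz_pos = [[max(w, 0.0) for w in row] for row in Wz]
            pre = [p + q for p, q in zip(pre, affine(Wz_pos, z))]
        z = [max(p, 0.0) for p in pre]  # ReLU is convex and non-decreasing
    return sum(z)  # a sum of convex functions is convex

if __name__ == "__main__":
    layers = make_params(in_dim=3, hidden=8, depth=4, seed=1)
    a, b = [1.0, -0.5, 2.0], [-1.5, 0.3, 0.7]
    mid = [(p + q) / 2 for p, q in zip(a, b)]
    # Midpoint convexity: f((a+b)/2) <= (f(a)+f(b))/2
    print(icnn(mid, layers) <= 0.5 * (icnn(a, layers) + icnn(b, layers)) + 1e-9)
```

Because the learned model is convex in its inputs, an MPC problem built on it can remain a convex program, which is what allows a solver to reach a global optimum. The recurrent and LSTM variants discussed here would apply analogous sign constraints to their gate and recurrent weights; those details are not reproduced in this sketch.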