By leveraging Input Convex Neural Networks (ICNNs), ICNN-based Model Predictive Control (MPC) attains globally optimal solutions by preserving convexity within the MPC framework. However, current ICNN architectures suffer from vanishing/exploding gradients, which limits their use as deep neural networks for complex tasks. In addition, current neural network-based MPC, both conventional and ICNN-based, converges more slowly than MPC based on first-principles models. In this study, we build on the principles of ICNNs to propose a novel Input Convex LSTM for Lyapunov-based MPC, with the goals of reducing convergence time and mitigating the vanishing/exploding gradient problem while ensuring closed-loop stability. In a simulation study of a nonlinear chemical reactor, we observed that the vanishing/exploding gradient problem was mitigated and that convergence time decreased by 46.7%, 31.3%, and 20.2% relative to the baseline plain RNN, plain LSTM, and Input Convex Recurrent Neural Network, respectively.
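As a rough illustration of the input-convexity property that ICNN-based MPC relies on, the sketch below builds a small feedforward ICNN in NumPy and numerically checks the convexity inequality f(t·a + (1−t)·b) ≤ t·f(a) + (1−t)·f(b). It is not the Input Convex LSTM proposed in this study. The layer sizes, random parameters, and function names (`make_icnn`, `icnn_forward`, `max_convexity_violation`) are assumptions made purely for illustration. The construction follows the standard ICNN recipe: nonnegative weights on hidden-to-hidden paths, unconstrained weights on direct input connections, and a convex, nondecreasing activation (ReLU).

```python
import numpy as np


def relu(x):
    # ReLU is convex and nondecreasing, which preserves convexity when composed
    # with a nonnegative combination of convex functions.
    return np.maximum(x, 0.0)


def make_icnn(rng, in_dim=2, hidden=8):
    """Random parameters for a two-hidden-layer feedforward ICNN (illustrative only)."""
    return {
        "W0": rng.normal(size=(hidden, in_dim)),            # input weights: unconstrained
        "b0": rng.normal(size=hidden),
        "Wz1": np.abs(rng.normal(size=(hidden, hidden))),   # hidden-to-hidden: nonnegative
        "Wy1": rng.normal(size=(hidden, in_dim)),           # input skip connection: unconstrained
        "b1": rng.normal(size=hidden),
        "Wz2": np.abs(rng.normal(size=(1, hidden))),        # hidden-to-output: nonnegative
        "Wy2": rng.normal(size=(1, in_dim)),                # input skip connection: unconstrained
        "b2": rng.normal(size=1),
    }


def icnn_forward(p, y):
    """Scalar output that is convex in the input vector y."""
    z1 = relu(p["W0"] @ y + p["b0"])
    z2 = relu(p["Wz1"] @ z1 + p["Wy1"] @ y + p["b1"])
    out = p["Wz2"] @ z2 + p["Wy2"] @ y + p["b2"]
    return float(out[0])


def max_convexity_violation(p, rng, trials=1000, in_dim=2):
    """Largest observed value of f(t*a + (1-t)*b) - [t*f(a) + (1-t)*f(b)].

    For a convex f this is never positive, up to floating-point rounding.
    """
    worst = 0.0
    for _ in range(trials):
        a = rng.normal(size=in_dim)
        b = rng.normal(size=in_dim)
        t = rng.uniform()
        lhs = icnn_forward(p, t * a + (1 - t) * b)
        rhs = t * icnn_forward(p, a) + (1 - t) * icnn_forward(p, b)
        worst = max(worst, lhs - rhs)
    return worst


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = make_icnn(rng)
    print("max convexity violation:", max_convexity_violation(params, rng))
```

A sampled check like this cannot prove convexity; it only fails to find a counterexample. The guarantee comes from the weight and activation constraints themselves, which is why an optimizer embedded in MPC can treat the resulting problem as convex.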