This paper proposes structured neural networks for reinforcement-learning-based nonlinear adaptive control. The focus is on partially observable systems, with separate neural networks for the state and feedforward observer and for the state feedback and feedforward controller. The observer dynamics are modelled by recurrent neural networks, while a standard network is used for the controller. As discussed in the paper, this confines the observer dynamics to the recurrent neural network part and the state feedback to the feedback and feedforward network. Compared with a single neural network, the structured approach reduces the computational complexity and gives the reinforcement-learning-based controller an understandable structure. Simulations show that the proposed structure has the additional, and main, advantage that training becomes significantly faster. Two ways of including feedforward structure are presented, one related to state feedback control and one related to classical feedforward control. The latter introduces further structure through a separate recurrent neural network that processes only the measured disturbance. When evaluated in simulation on a nonlinear cascaded double tank process, the method with the most structure performs best, with excellent feedforward disturbance rejection gains.
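The separation described in the abstract can be illustrated with a minimal sketch of the more structured variant: one recurrent block acting as the observer, a second recurrent block that sees only the measured disturbance, and a network without internal state as the controller. Everything concrete here is an assumption made for illustration and is not taken from the paper: the class names (`RecurrentBlock`, `StaticController`, `StructuredPolicy`), the Elman-style update, the tanh activations, the layer sizes, and the random untrained weights. No reinforcement learning update is shown; the sketch covers only the forward pass.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.3):
    """Small random weights; a placeholder for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class RecurrentBlock:
    """Elman-style recurrent layer (assumed form): h <- tanh(W_h h + W_in x)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W_h = rand_matrix(n_hidden, n_hidden, rng)
        self.W_in = rand_matrix(n_hidden, n_in, rng)
        self.h = [0.0] * n_hidden

    def step(self, x):
        pre = [a + b for a, b in zip(matvec(self.W_h, self.h), matvec(self.W_in, x))]
        self.h = [math.tanh(p) for p in pre]
        return self.h


class StaticController:
    """One-hidden-layer network with no internal state (assumed form)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W1 = rand_matrix(n_hidden, n_in, rng)
        self.W2 = rand_matrix(1, n_hidden, rng)

    def __call__(self, x):
        hidden = [math.tanh(p) for p in matvec(self.W1, x)]
        return matvec(self.W2, hidden)[0]


class StructuredPolicy:
    """Hypothetical composition: observer RNN + disturbance RNN + static controller."""

    def __init__(self, rng, n_state=4, n_dist=2):
        self.observer = RecurrentBlock(2, n_state, rng)     # inputs: y and previous u
        self.dist_filter = RecurrentBlock(1, n_dist, rng)   # input: measured disturbance d only
        self.controller = StaticController(n_state + n_dist + 1, 8, rng)
        self.u_prev = 0.0

    def act(self, y, d, r):
        x_hat = self.observer.step([y, self.u_prev])  # state estimate from measurements
        z = self.dist_filter.step([d])                # feedforward features from d alone
        self.u_prev = self.controller(x_hat + z + [r])  # static map to the control signal
        return self.u_prev


# Usage: five control steps with a ramping measurement and a fixed reference.
policy = StructuredPolicy(random.Random(0))
controls = [policy.act(y=0.1 * k, d=0.0, r=1.0) for k in range(5)]
```

In this arrangement all memory lives in the two recurrent blocks, so the controller can stay a plain static map, and the disturbance path can be inspected on its own because its hidden state never depends on the measured output.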