Recurrent neural networks are a successful neural architecture for many time-dependent problems, including time series analysis, forecasting, and the modeling of dynamical systems. Training such networks with backpropagation through time is notoriously difficult because their loss gradients tend to explode or vanish. In this contribution, we introduce a computational approach that constructs all weights and biases of a recurrent neural network without any gradient-based optimization. The approach combines random feature networks with Koopman operator theory for dynamical systems. The hidden parameters of a single recurrent block are sampled at random, while the outer weights are constructed with extended dynamic mode decomposition. This alleviates the problems commonly associated with backpropagation through recurrent networks. The connection to Koopman operator theory also allows us to draw on results from that area to analyze recurrent neural networks. In computational experiments on time series, forecasting for chaotic dynamical systems, and control problems, as well as on weather data, the recurrent networks we construct train faster and forecast more accurately than networks trained with commonly used gradient-based methods.
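The core recipe described above — sample the hidden parameters of a recurrent block at random, then fit only a linear readout to the next state by least squares, in the spirit of extended dynamic mode decomposition — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the toy dynamical system, the feature dimension, and the weight scalings below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trajectory of a 2-D linear dynamical system (assumed data: a damped rotation).
A = 0.99 * np.array([[np.cos(0.1), -np.sin(0.1)],
                     [np.sin(0.1),  np.cos(0.1)]])
T = 500
X = np.empty((T, 2))
X[0] = [1.0, 0.0]
for t in range(1, T):
    X[t] = A @ X[t - 1]

# Random-feature recurrent block: hidden weights are sampled once and never trained.
d_hidden = 200
W_in = rng.normal(scale=1.0, size=(d_hidden, 2))
W_rec = rng.normal(scale=0.5 / np.sqrt(d_hidden), size=(d_hidden, d_hidden))
b = rng.normal(size=d_hidden)

# Drive the block with the trajectory to collect hidden states (features).
H = np.zeros((T, d_hidden))
for t in range(1, T):
    H[t] = np.tanh(W_in @ X[t - 1] + W_rec @ H[t - 1] + b)

# EDMD-style outer weights: a single linear least-squares solve, no backpropagation.
W_out, *_ = np.linalg.lstsq(H[1:], X[1:], rcond=None)

# One-step prediction error of the fitted readout on the training trajectory.
pred = H[1:] @ W_out
rel_err = np.linalg.norm(pred - X[1:]) / np.linalg.norm(X[1:])
```

The only "training" step is the `lstsq` call, which replaces backpropagation through time entirely; this is what removes exploding and vanishing gradients from the picture.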