Recurrent neural networks are a successful neural architecture for many time-dependent problems, including time series analysis, forecasting, and the modeling of dynamical systems. Training such networks with backpropagation through time is notoriously difficult because their loss gradients tend to explode or vanish. In this contribution, we introduce a computational approach that constructs all weights and biases of a recurrent neural network without any gradient-based optimization. The approach combines random feature networks with Koopman operator theory for dynamical systems. The hidden parameters of a single recurrent block are sampled at random, while the outer weights are constructed with extended dynamic mode decomposition. This alleviates the problems commonly associated with backpropagation through recurrent networks. The connection to Koopman operator theory also allows us to draw on results from that area to analyze recurrent neural networks. In computational experiments on time series, forecasting for chaotic dynamical systems, and control problems, as well as on weather data, the recurrent networks we construct train faster and forecast more accurately than networks trained with commonly used gradient-based methods.
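The core recipe described above — sample the hidden parameters of a recurrent block at random, then fit only a linear readout to the next state by least squares, in the spirit of extended dynamic mode decomposition — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the toy dynamical system, the feature dimension, and the weight scalings below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trajectory of a 2-D linear dynamical system (assumed data: a damped rotation).
A = 0.99 * np.array([[np.cos(0.1), -np.sin(0.1)],
                     [np.sin(0.1),  np.cos(0.1)]])
T = 500
X = np.empty((T, 2))
X[0] = [1.0, 0.0]
for t in range(1, T):
    X[t] = A @ X[t - 1]

# Random-feature recurrent block: hidden weights are sampled once and never trained.
d_hidden = 200
W_in = rng.normal(scale=1.0, size=(d_hidden, 2))
W_rec = rng.normal(scale=0.5 / np.sqrt(d_hidden), size=(d_hidden, d_hidden))
b = rng.normal(size=d_hidden)

# Drive the block with the trajectory to collect hidden states (features).
H = np.zeros((T, d_hidden))
for t in range(1, T):
    H[t] = np.tanh(W_in @ X[t - 1] + W_rec @ H[t - 1] + b)

# EDMD-style outer weights: a single linear least-squares solve, no backpropagation.
W_out, *_ = np.linalg.lstsq(H[1:], X[1:], rcond=None)

# One-step prediction error of the fitted readout on the training trajectory.
pred = H[1:] @ W_out
rel_err = np.linalg.norm(pred - X[1:]) / np.linalg.norm(X[1:])
```

The only "training" step is the `lstsq` call, which replaces backpropagation through time entirely; this is what removes exploding and vanishing gradients from the picture.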