Recurrent neural networks (RNNs) are known to be universal approximators of dynamic systems under fairly mild and general assumptions, making them good tools for processing temporal information. However, RNNs usually suffer from vanishing and exploding gradients under standard training. Reservoir computing (RC), a special RNN in which the recurrent weights are randomized and left untrained, was introduced to overcome these issues and has demonstrated superior empirical performance in fields as diverse as natural language processing and wireless communications, especially in scenarios where training samples are extremely limited. However, the theoretical grounding for this observed performance has not developed at the same pace. In this work, we show that RNNs can provide universal approximation of linear time-invariant (LTI) systems. Specifically, we show that RC can universally approximate a general LTI system. We present a clear signal processing interpretation of RC and apply this understanding to the problem of simulating a generic LTI system through RC. Under this setup, we analytically characterize the optimal probability distribution function for generating the recurrent weights of the RNN underlying the RC. We provide extensive numerical evaluations to validate the optimality of the derived distribution of the recurrent weights for the LTI system simulation problem. Our work yields a clear signal processing-based interpretation of the RC model and provides a theoretical explanation for the power of randomly setting, rather than training, RC's recurrent weights. It further provides a complete analytical characterization of the optimal untrained recurrent weights, marking an important step towards explainable machine learning (XML), which is extremely important for applications where training samples are limited.
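The RC setup described above (random, untrained recurrent weights with only the output layer fitted) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from this work: the neurons are linear, the recurrent matrix is diagonal so each neuron acts as a first-order filter with one random pole, the poles are drawn uniformly from (-0.9, 0.9) rather than from the optimal distribution derived in the paper, and the target LTI system, the function names `simulate_target` and `run_reservoir`, and all sizes are hypothetical.

```python
import numpy as np


def simulate_target(u):
    """Hypothetical stable target LTI system (poles at 0.6 and -0.3):
    y[n] = 0.3*y[n-1] + 0.18*y[n-2] + u[n]."""
    y = np.zeros_like(u)
    for n in range(len(u)):
        y[n] = u[n]
        if n >= 1:
            y[n] += 0.3 * y[n - 1]
        if n >= 2:
            y[n] += 0.18 * y[n - 2]
    return y


def run_reservoir(u, poles):
    """Linear reservoir with a diagonal recurrent matrix (an assumption):
    neuron i follows x_i[n] = poles[i] * x_i[n-1] + u[n]."""
    x = np.zeros(len(poles))
    states = np.empty((len(u), len(poles)))
    for n, u_n in enumerate(u):
        x = poles * x + u_n
        states[n] = x
    return states


rng = np.random.default_rng(0)

# Recurrent weights are drawn once at random and never trained.
# Uniform(-0.9, 0.9) is an illustrative choice, not the derived optimum.
poles = rng.uniform(-0.9, 0.9, size=30)

u_train = rng.standard_normal(2000)
u_test = rng.standard_normal(500)

# Only the linear readout is fitted, via ordinary least squares.
w_out, *_ = np.linalg.lstsq(
    run_reservoir(u_train, poles), simulate_target(u_train), rcond=None
)

# Evaluate on an input sequence that was not used for fitting.
y_hat = run_reservoir(u_test, poles) @ w_out
y_true = simulate_target(u_test)
nmse = np.mean((y_hat - y_true) ** 2) / np.mean(y_true ** 2)
print(f"test NMSE: {nmse:.2e}")
```

Because training reduces to one least-squares solve for the readout, no gradient is propagated through time, which is why the vanishing and exploding gradient issues do not arise in this setup. Replacing the uniform draw with another distribution for `poles` is the natural place to experiment with how the choice of recurrent weights affects the approximation error.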