In this paper, we derive a PAC-Bayes bound on the generalisation gap in a supervised time-series setting for a special class of discrete-time non-linear dynamical systems. This class includes stable recurrent neural networks (RNNs), whose analysis motivated this work. To obtain the results, we impose stability constraints on the allowed models, where stability is understood in the sense of dynamical systems. For RNNs, these stability conditions can be expressed as conditions on the weights. We assume that the processes involved are essentially bounded and that the loss functions are Lipschitz. The proposed bound on the generalisation gap depends on the mixing coefficient of the data distribution and on the essential supremum of the data. Furthermore, the bound converges to zero as the dataset size increases. We 1) formalise the learning problem, 2) derive a PAC-Bayesian error bound for such systems, 3) discuss various consequences of this error bound, and 4) present an illustrative example and discuss how to compute the proposed bound. Unlike other available bounds, the derived bound holds for non-i.i.d. data (time series) and does not grow with the number of time steps of the RNN.
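To illustrate what a stability condition on the weights can look like, the following is a minimal sketch, not the condition used in the paper. It assumes a plain tanh RNN of the form h_{t+1} = tanh(W h_t + U x_t + b) and uses a standard sufficient condition for contractivity: since tanh is 1-Lipschitz, a recurrent matrix W with spectral norm below 1 makes the state map a contraction. The function names `is_contractive` and `rnn_step`, the dimensions, and the random data are all hypothetical choices made for this example.

```python
import numpy as np


def is_contractive(W, activation_lipschitz=1.0):
    """Sufficient (not necessary) stability check: L * ||W||_2 < 1."""
    spectral_norm = np.linalg.norm(W, ord=2)  # largest singular value of W
    return bool(activation_lipschitz * spectral_norm < 1.0)


def rnn_step(W, U, b, h, x):
    """One step of the assumed tanh RNN: h_next = tanh(W h + U x + b)."""
    return np.tanh(W @ h + U @ x + b)


rng = np.random.default_rng(0)
n, m = 4, 2  # hypothetical state and input dimensions

# Rescale a random matrix so that its spectral norm is 0.9 (assumed value).
W = rng.standard_normal((n, n))
W_stable = 0.9 * W / np.linalg.norm(W, ord=2)
U = rng.standard_normal((n, m))
b = np.zeros(n)

# Two trajectories with different initial states but the same bounded input.
h1, h2 = rng.standard_normal(n), rng.standard_normal(n)
gap0 = np.linalg.norm(h1 - h2)
for _ in range(50):
    x = rng.uniform(-1.0, 1.0, size=m)
    h1 = rnn_step(W_stable, U, b, h1, x)
    h2 = rnn_step(W_stable, U, b, h2, x)
gap = np.linalg.norm(h1 - h2)

print(is_contractive(W_stable), gap <= 0.9**50 * gap0 + 1e-12)
```

Under these assumptions, the distance between the two state trajectories shrinks by at least a factor of 0.9 per step, so the effect of the initial state is forgotten geometrically. This kind of forgetting is what makes it plausible for a bound not to grow with the number of time steps.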