Conventional notions of generalization often fail to describe the ability of learned models to capture meaningful information from dynamical data. A neural network that learns complex dynamics with a small test error may still fail to reproduce its \emph{physical} behavior, including the associated statistical moments and Lyapunov exponents. To address this gap, we propose an ergodic-theoretic approach to the generalization of complex dynamical models learned from time-series data. Our main contribution is to define and analyze generalization for a broad suite of neural representations of classes of ergodic systems, including chaotic systems, in a way that captures whether the learned model emulates the underlying invariant (physical) measures. Our results provide theoretical justification for why regression methods for the generators of dynamical systems (Neural ODEs) fail to generalize, and why their statistical accuracy improves when Jacobian information is added during training. We verify our results on a number of ergodic chaotic systems and neural network parameterizations, including MLPs, ResNets, Fourier neural layers, and RNNs.
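To make the Jacobian-matching idea concrete, here is a minimal sketch of the kind of augmented regression loss the abstract alludes to: a learned vector field is fit not only to match the true dynamics pointwise but also to match its Jacobian. The Lorenz-63 system, the squared-error loss form, the weight `lam`, and the finite-difference Jacobian are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def lorenz(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # True Lorenz-63 vector field, a standard ergodic chaotic system.
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

def jacobian_fd(f, x, eps=1e-6):
    # Central finite-difference Jacobian of a vector field f at x
    # (an autodiff Jacobian would be used in practice).
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * eps)
    return J

def jacobian_matched_loss(model, xs, lam=0.1):
    # Hypothetical training objective: ordinary vector-field regression
    # plus a Jacobian-matching penalty weighted by lam. The abstract's
    # claim is that penalties of this kind improve statistical accuracy
    # (e.g., recovery of invariant measures and Lyapunov exponents).
    loss = 0.0
    for x in xs:
        loss += np.sum((model(x) - lorenz(x)) ** 2)
        loss += lam * np.sum(
            (jacobian_fd(model, x) - jacobian_fd(lorenz, x)) ** 2
        )
    return loss / len(xs)
```

A model that reproduces only the vector field values but not their derivatives is penalized by the second term, which is the mechanism by which Jacobian information constrains the learned dynamics beyond pointwise test error.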