Non-linear dynamical systems can be conveniently described by the associated Koopman operator, whose action evolves every observable of the system forward in time. A number of algorithms make it possible to learn the Koopman operator from data. In this work we present non-asymptotic learning bounds for the Koopman eigenvalues and eigenfunctions estimated by two popular algorithms: Extended Dynamic Mode Decomposition (EDMD) and Reduced Rank Regression (RRR). We focus on time-reversal-invariant Markov chains, for which the Koopman operator is self-adjoint. This class includes important examples of stochastic dynamical systems, notably Langevin dynamics. Our spectral learning bounds are driven by the simultaneous control of the operator norm risk of the estimators and a metric distortion associated with the corresponding eigenfunctions. Our analysis indicates that both algorithms have similar variance, but that EDMD suffers from a larger bias, which might be detrimental to its learning rate. We further argue that a large metric distortion may lead to spurious eigenvalues, a phenomenon that has been observed empirically, and note that metric distortion can be estimated from data. Numerical experiments complement the theoretical findings.
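To make the estimation setting concrete, the following is a minimal sketch of EDMD on a toy time-reversible Markov chain. It is an illustration under stated assumptions, not the experimental setup of this work: the chain is a Gaussian AR(1) process (a simple stand-in for discretized Langevin dynamics), the dictionary consists of monomials, and the names `simulate_ou`, `dictionary`, and `edmd_eigenvalues` are hypothetical helpers introduced here.

```python
import numpy as np


def simulate_ou(n_steps, a=0.8, seed=0):
    """Simulate X_{t+1} = a * X_t + sqrt(1 - a^2) * noise (assumed toy chain).

    The chain is started from its stationary law N(0, 1), under which it is
    time-reversible.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = rng.standard_normal()
    noise = rng.standard_normal(n_steps)
    scale = np.sqrt(1.0 - a**2)
    for t in range(n_steps):
        x[t + 1] = a * x[t] + scale * noise[t]
    return x


def dictionary(x, degree=3):
    """Evaluate the monomial dictionary 1, x, ..., x^degree at each sample."""
    return np.vander(x, N=degree + 1, increasing=True)


def edmd_eigenvalues(x, degree=3, ridge=1e-8):
    """Estimate Koopman eigenvalues by least squares on dictionary features.

    Solves (C + ridge * I) G = T, where C is the empirical covariance of the
    features and T the empirical cross-covariance between consecutive states.
    The small ridge term is an assumption added for numerical stability.
    """
    phi_now = dictionary(x[:-1], degree)
    phi_next = dictionary(x[1:], degree)
    n = phi_now.shape[0]
    cov = phi_now.T @ phi_now / n
    cross = phi_now.T @ phi_next / n
    g = np.linalg.solve(cov + ridge * np.eye(cov.shape[0]), cross)
    eigvals = np.linalg.eigvals(g)
    # The true operator is self-adjoint, so only the real parts are kept.
    return np.sort(eigvals.real)[::-1]


if __name__ == "__main__":
    trajectory = simulate_ou(50_000, a=0.8, seed=0)
    print(edmd_eigenvalues(trajectory, degree=3))
```

For this particular chain the span of the monomials is invariant under the Koopman operator, whose leading eigenvalues are the powers of `a`, so the estimates should approach 1, `a`, `a**2`, and `a**3` as the trajectory grows. With a dictionary that is not invariant, the estimator would additionally incur the kind of bias discussed above.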