The theory of Koopman operators makes it possible to deploy non-parametric machine learning algorithms to predict and analyze complex dynamical systems. Estimators such as principal component regression (PCR) or reduced rank regression (RRR) in kernel spaces can be shown to provably learn Koopman operators from finite empirical observations of the system's time evolution. Scaling these approaches to very long trajectories is challenging and requires suitable approximations to make the computations feasible. In this paper, we boost the efficiency of different kernel-based Koopman operator estimators using random projections (sketching). We derive, implement, and test the new "sketched" estimators with extensive experiments on synthetic and large-scale molecular dynamics datasets. Further, we establish non-asymptotic error bounds that give a sharp characterization of the trade-offs between statistical learning rates and computational efficiency. Our empirical and theoretical analysis shows that the proposed estimators provide a sound and efficient way to learn large-scale dynamical systems. In particular, our experiments indicate that the proposed estimators retain the same accuracy as PCR or RRR while being much faster.
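To make the idea of sketching a kernel-based estimator concrete, the following is a minimal illustration and not the estimators derived in this paper. It assumes a Nyström-type sketch (a random subset of training points used as landmarks) combined with plain kernel ridge regression, a simpler stand-in for PCR or RRR, to approximate the action of the Koopman operator on the identity observable, that is, a one-step predictor. The function names, the Gaussian kernel, the toy system x_{t+1} = 0.9 x_t, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, length_scale=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def fit_sketched_predictor(x, y, num_landmarks=20, reg=1e-6, seed=0):
    """Nystrom-sketched kernel ridge regression from states x to successors y.

    Only an (n x m) and an (m x m) kernel matrix are formed, with m landmarks,
    instead of the full (n x n) kernel matrix.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # Sketch: pick m training points uniformly at random as landmarks.
    idx = rng.choice(n, size=num_landmarks, replace=False)
    landmarks = x[idx]
    k_nm = gaussian_kernel(x, landmarks)
    k_mm = gaussian_kernel(landmarks, landmarks)
    # Regularized normal equations restricted to the span of the landmarks.
    lhs = k_nm.T @ k_nm + reg * n * k_mm
    rhs = k_nm.T @ y
    # Least squares is used because lhs can be numerically rank deficient.
    coef, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return landmarks, coef


def predict(landmarks, coef, x_new):
    """Approximate the expected next state for each row of x_new."""
    return gaussian_kernel(x_new, landmarks) @ coef


# Toy deterministic system (an assumption for illustration): x_{t+1} = 0.9 * x_t.
rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, size=(500, 1))
y = 0.9 * x

landmarks, coef = fit_sketched_predictor(x, y)
x_test = np.linspace(-0.8, 0.8, 9).reshape(-1, 1)
max_error = np.abs(predict(landmarks, coef, x_test) - 0.9 * x_test).max()
```

In this sketch the cost of the fit is governed by the number of landmarks m rather than by the full trajectory length n, which is the kind of trade-off between computational cost and statistical accuracy that the error bounds address.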