The theory of Koopman operators makes it possible to deploy non-parametric machine learning algorithms to predict and analyze complex dynamical systems. Estimators such as principal component regression (PCR) or reduced rank regression (RRR) in kernel spaces can be shown to provably learn Koopman operators from finite empirical observations of the system's time evolution. Scaling these approaches to very long trajectories is challenging and requires suitable approximations to keep computations feasible. In this paper, we boost the efficiency of different kernel-based Koopman operator estimators using random projections (sketching). We derive, implement, and test the new "sketched" estimators with extensive experiments on synthetic and large-scale molecular dynamics datasets. Further, we establish non-asymptotic error bounds that give a sharp characterization of the trade-offs between statistical learning rates and computational efficiency. Our empirical and theoretical analysis shows that the proposed estimators provide a sound and efficient way to learn large-scale dynamical systems. In particular, our experiments indicate that the proposed estimators retain the same accuracy as PCR or RRR, while being much faster.