Many machine learning approaches for decision making, such as reinforcement learning, rely on simulators or predictive models to forecast the time-evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, making their use in optimization-based decision-making challenging. Koopman operator theory offers a beneficial paradigm for addressing this problem by characterizing forecasts via linear time-invariant (LTI) ODEs, turning multi-step forecasts into sparse matrix multiplication. Although a variety of learning approaches exist, they usually lack crucial learning-theoretic guarantees, so the behavior of the obtained models with increasing data and dimensionality remains unclear. We address this gap by deriving a universal Koopman-invariant reproducing kernel Hilbert space (RKHS) that solely spans transformations into LTI dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation to obtain novel convergence results and generalization error bounds under weaker assumptions than existing work. Our experiments demonstrate superior forecasting performance compared to Koopman operator and sequential data predictors in RKHS.
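To illustrate the claim that a Koopman-invariant lifting turns multi-step forecasts into matrix multiplication, the following minimal sketch uses a textbook toy system, not the KKR method itself. The discrete-time dynamics, the constants `A_COEF`, `B_COEF`, `C_COEF`, the hand-chosen lifting `lift`, and all function names are assumptions made for illustration; in KKR the lifting would be learned from data rather than written down by hand.

```python
# Toy nonlinear system (an illustrative assumption, not taken from the paper):
#   x1[k+1] = a * x1[k]
#   x2[k+1] = b * x2[k] + c * x1[k]**2
A_COEF, B_COEF, C_COEF = 0.9, 0.5, 1.0


def step_nonlinear(x):
    """Advance the nonlinear system by one time step."""
    x1, x2 = x
    return (A_COEF * x1, B_COEF * x2 + C_COEF * x1 ** 2)


def lift(x):
    """Hand-chosen lifting z = (x1, x2, x1^2); its span is invariant under the dynamics."""
    x1, x2 = x
    return [x1, x2, x1 ** 2]


def koopman_matrix():
    """Matrix K with lift(step_nonlinear(x)) == K @ lift(x) for every x."""
    return [
        [A_COEF, 0.0, 0.0],
        [0.0, B_COEF, C_COEF],
        [0.0, 0.0, A_COEF ** 2],
    ]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, v):
    """Matrix-vector product."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def matrix_power(a, k):
    """a raised to the non-negative integer power k (identity for k == 0)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result


def forecast_nonlinear(x0, k):
    """k-step forecast by iterating the nonlinear map."""
    x = x0
    for _ in range(k):
        x = step_nonlinear(x)
    return x


def forecast_linear(x0, k):
    """k-step forecast as one matrix power applied to the lifted initial state."""
    z = matvec(matrix_power(koopman_matrix(), k), lift(x0))
    return (z[0], z[1])  # the first two lifted coordinates are the original state
```

For this system the two forecasts agree up to floating-point error for any horizon `k`, because the lifted coordinates evolve exactly linearly. Diagonalizing the matrix (that is, working in eigenfunction coordinates) would reduce the `k`-step forecast to elementwise powers of the eigenvalues, which is one way to read the "sparse" in the abstract.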