Many machine learning approaches for decision making, such as reinforcement learning, rely on simulators or predictive models to forecast the time evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, which makes them difficult to use in optimization-based decision making. Koopman operator theory offers a useful paradigm for addressing this problem: it characterizes forecasts via linear time-invariant (LTI) ODEs, turning multi-step forecasting into sparse matrix multiplications. Although a variety of learning approaches exist, they usually lack crucial learning-theoretic guarantees, so it is unclear how the resulting models behave as the amount of data and the dimensionality increase. We address this gap by deriving a novel reproducing kernel Hilbert space (RKHS) over trajectories that solely spans transformations into LTI dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation to obtain novel convergence results and generalization error bounds under weaker assumptions than existing work. Our experiments demonstrate superior forecasting performance compared with Koopman operator predictors and sequential data predictors in RKHS.
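To make concrete the claim that lifting to LTI coordinates turns multi-step forecasting into matrix multiplications, the following is a minimal sketch on a standard textbook system, not the KKR method itself. It assumes a discrete-time nonlinear map whose lift `z = (x1, x2, x1^2)` is known in closed form; the rates `A` and `B`, the matrix `K`, and all function names are hypothetical choices made for illustration. In KKR the lifting would be learned from trajectory data rather than written by hand.

```python
"""Toy illustration: multi-step forecasting as repeated matrix multiplication."""

A, B = 0.9, 0.5  # hypothetical contraction rates, chosen only for illustration


def step(x):
    """One step of the nonlinear map on the original state (x1, x2)."""
    x1, x2 = x
    return (A * x1, B * x2 + (A**2 - B) * x1**2)


def lift(x):
    """Hand-written lifting z = (x1, x2, x1^2); an assumption, not a learned map."""
    x1, x2 = x
    return [x1, x2, x1**2]


# In lifted coordinates the dynamics are exactly linear: z_next = K z.
K = [
    [A, 0.0, 0.0],
    [0.0, B, A**2 - B],
    [0.0, 0.0, A**2],
]


def matvec(M, v):
    """Plain matrix-vector product (standard library only)."""
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def forecast_nonlinear(x0, horizon):
    """Roll the nonlinear map forward step by step."""
    x, out = tuple(x0), []
    for _ in range(horizon):
        x = step(x)
        out.append(x)
    return out


def forecast_lifted(x0, horizon):
    """Forecast by multiplying the lifted state with K, then reading off (x1, x2)."""
    z, out = lift(x0), []
    for _ in range(horizon):
        z = matvec(K, z)
        out.append((z[0], z[1]))
    return out
```

For any initial state, `forecast_lifted(x0, h)` and `forecast_nonlinear(x0, h)` agree up to floating-point rounding, because the three lifted coordinates are closed under the dynamics. For general nonlinear systems no such finite closed-form lift is available, which is the gap that learned liftings are meant to fill.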