Many machine learning approaches for decision-making, such as reinforcement learning, rely on simulators or predictive models to forecast the time-evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, making their use in optimization-based decision-making challenging. Koopman operator theory offers a useful paradigm for addressing this problem by characterizing forecasts via linear dynamical systems. This makes system analysis and long-term predictions simple, involving only matrix multiplications. However, the transformation to a linear system is generally non-trivial and unknown, requiring learning-based approaches. While a variety of such approaches exists, they usually lack crucial learning-theoretic guarantees, so the behavior of the obtained models with increasing data and dimensionality is often unclear. We address this gap by deriving a novel reproducing kernel Hilbert space (RKHS) that solely spans transformations into linear dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation to obtain novel convergence results and generalization risk bounds under weaker assumptions than existing work. Our numerical experiments indicate advantages over state-of-the-art statistical learning approaches for Koopman-based predictors.
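To make the remark about long-term prediction via matrix multiplications concrete, the following is a minimal sketch on a textbook-style toy system and is not the KKR method itself. The constants `A` and `B`, the functions `step`, `lift`, `rollout_nonlinear`, and `predict_linear`, and the matrix `K` are all hypothetical names chosen for illustration. The sketch assumes a system for which an exact finite-dimensional lifting is known in closed form; in general the lifting is unknown and has to be learned.

```python
import numpy as np

# Hypothetical toy parameters (assumptions, not taken from the paper).
A, B = 0.9, 0.5


def step(x):
    """One step of a nonlinear toy system in the original state (x1, x2)."""
    x1, x2 = x
    return np.array([A * x1, B * x2 + (A**2 - B) * x1**2])


def lift(x):
    """Hand-picked lifting z = (x1, x2, x1^2); known exactly for this toy system."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])


# In the lifted coordinates the dynamics are exactly linear: z_next = K @ z.
K = np.array([
    [A, 0.0, 0.0],
    [0.0, B, A**2 - B],
    [0.0, 0.0, A**2],
])


def rollout_nonlinear(x0, n_steps):
    """Forecast by iterating the nonlinear map n_steps times."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = step(x)
    return x


def predict_linear(x0, n_steps):
    """Forecast with a single matrix power applied to the lifted initial state."""
    z = np.linalg.matrix_power(K, n_steps) @ lift(np.asarray(x0, dtype=float))
    return z[:2]  # the first two lifted coordinates are the original state


if __name__ == "__main__":
    x0 = [1.0, -2.0]
    print(rollout_nonlinear(x0, 25))
    print(predict_linear(x0, 25))
```

Both functions should return the same forecast up to floating-point error, since `lift(step(x))` equals `K @ lift(x)` for every state of this particular system. For general nonlinear systems no such closed-form lifting is available, which is the setting where learning-based approaches come in.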