This paper proposes a fully data-driven approach to optimal control of nonlinear control-affine systems modeled as stochastic diffusions. We focus on the setting in which both the nonlinear dynamics and the stage cost functions are unknown, and only a control penalty function and constraints are given. To this end, we embed state probability densities into a reproducing kernel Hilbert space (RKHS) and use recent advances in operator regression to identify the Markov transition operators associated with controlled diffusion processes. This operator learning approach integrates naturally with convex operator-theoretic Hamilton-Jacobi-Bellman recursions that scale linearly with the state dimension, which lets it address a wide range of nonlinear optimal control problems. Numerical results demonstrate the method on diverse nonlinear control tasks, including depth regulation of an autonomous underwater vehicle.
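To make the operator-regression idea concrete, the following is a minimal sketch and not the paper's algorithm. It uses a standard kernel ridge (conditional mean embedding) estimator to approximate the one-step conditional expectation E[g(x_next) | x] from sampled transitions. The toy dynamics dx = (-x^3 + u) dt + sigma dW, the Gaussian kernel, the bandwidth, the regularization weight, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel matrix with entries exp(-|a_i - b_j|^2 / (2 * bandwidth^2))."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def euler_maruyama_step(x, u, dt, sigma, rng):
    """One step of the toy diffusion dx = (-x^3 + u) dt + sigma dW (hypothetical dynamics)."""
    noise = sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x + (-x**3 + u) * dt + noise


def conditional_expectation(X, Y, g, X_query, bandwidth=0.5, reg=1e-3):
    """Kernel ridge estimate of E[g(x_next) | x] at each query state.

    X holds sampled states, Y the corresponding next states. The estimator
    only sees these pairs; it never evaluates the drift directly.
    """
    n = X.shape[0]
    K = gaussian_gram(X, X, bandwidth)
    k_query = gaussian_gram(X, X_query, bandwidth)
    # Solve (K + n * reg * I) W = k_query; W has shape (n, number of queries).
    weights = np.linalg.solve(K + n * reg * np.eye(n), k_query)
    return weights.T @ g(Y)


# Collect one-step transitions under a fixed control input (assumed u = 0).
rng = np.random.default_rng(0)
dt, sigma, u = 0.1, 0.5, 0.0
X = rng.uniform(-2.0, 2.0, size=(400, 1))
Y = euler_maruyama_step(X, u, dt, sigma, rng)

# Estimate the conditional mean of the next state at x = 0.5.
x_query = np.array([[0.5]])
estimate = conditional_expectation(X, Y, lambda y: y[:, 0], x_query)[0]

# For this Euler-Maruyama model the noise has zero mean, so the exact
# conditional mean is x + (-x^3 + u) * dt.
expected = x_query[0, 0] + (-x_query[0, 0] ** 3 + u) * dt
```

In this sketch the learned weights act as a finite-sample surrogate for the transition operator applied to g. Handling state-dependent control inputs, constraints, and the Hamilton-Jacobi-Bellman recursion described in the abstract is outside its scope.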