This paper proposes a fully data-driven approach to the optimal control of nonlinear control-affine systems modeled as stochastic diffusions. We focus on the setting in which both the nonlinear dynamics and the stage cost function are unknown, and only the control penalty function and the constraints are given. Building on the theory of reproducing kernel Hilbert spaces, we introduce novel kernel mean embeddings (KMEs) to identify the Markov transition operators associated with controlled diffusion processes. The KME learning approach integrates directly with modern convex operator-theoretic Hamilton-Jacobi-Bellman recursions. As a result, unlike traditional dynamic programming methods, our approach exploits the ``kernel trick'' to break the curse of dimensionality. We demonstrate the effectiveness of the method on numerical examples, highlighting its ability to solve a large class of nonlinear optimal control problems.
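To make the idea of learning a Markov transition operator through a kernel mean embedding concrete, the following is a minimal sketch of the standard empirical conditional mean embedding estimator, not the controlled-diffusion embeddings introduced in the paper. Everything in it is an assumption made for illustration: the uncontrolled scalar toy system `x_next = 0.5 * x + noise`, the Gaussian kernel, the bandwidth and regularization values, and the helper names `gaussian_gram` and `cme_weights`. Given transition samples, it estimates the conditional expectation of a function of the next state as a weighted sum over the observed next states.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gram matrix k(a_i, b_j) = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def cme_weights(X, x_query, bandwidth=0.5, reg=1e-3):
    """Weights beta(x) = (K + n * reg * I)^{-1} k(X, x) of the empirical
    conditional mean embedding (hypothetical helper, illustrative defaults)."""
    n = X.shape[0]
    K = gaussian_gram(X, X, bandwidth)
    k_query = gaussian_gram(X, x_query, bandwidth)
    return np.linalg.solve(K + n * reg * np.eye(n), k_query)

# Toy transition data (assumed dynamics, used only to generate samples).
rng = np.random.default_rng(0)
n = 200
X = rng.uniform(-2.0, 2.0, size=(n, 1))
X_next = 0.5 * X + 0.1 * rng.standard_normal((n, 1))

# Estimate E[f(X_next) | X = 1] for f(y) = y as a weighted sum of samples.
x_query = np.array([[1.0]])
beta = cme_weights(X, x_query)                 # shape (n, 1)
estimate = float(X_next[:, 0] @ beta[:, 0])
print(f"estimated conditional mean at x = 1: {estimate:.3f}")
```

The estimator only touches the data through Gram matrices, so its cost scales with the number of samples rather than with a grid over the state space; this is the sense in which kernel methods sidestep state-space discretization. For the toy dynamics assumed here, the true conditional mean at `x = 1` is 0.5, which gives a reference point for the printed estimate.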