This paper studies convergence rates for certain value function approximations that arise in a collection of reproducing kernel Hilbert spaces (RKHS) $H(\Omega)$. Casting an optimal control problem in a specific class of native spaces yields strong rates of convergence for the operator equation that enables the offline approximations appearing in policy iteration. Explicit upper bounds on the error in the value function and controller approximations are derived in terms of the power function $\mathcal{P}_{H,N}$ of the space of finite-dimensional approximants $H_N$ in the native space $H(\Omega)$. These bounds are geometric in nature and refine some well-known, now classical results on the convergence of value function approximations.
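For orientation, the standard native-space definition of the power function and the pointwise error bound it gives are recalled here. This is a sketch under assumptions not stated in the abstract: $K$ denotes the reproducing kernel of $H(\Omega)$, $K_N$ the reproducing kernel of the closed subspace $H_N$, and $\Pi_N$ the $H(\Omega)$-orthogonal projection onto $H_N$. The paper's own bounds on the value function and controller errors may take a different form.

$$
\mathcal{P}_{H,N}(x) = \sqrt{K(x,x) - K_N(x,x)}, \qquad
\left| f(x) - (\Pi_N f)(x) \right| \le \mathcal{P}_{H,N}(x)\, \left\| (I - \Pi_N) f \right\|_{H(\Omega)} \le \mathcal{P}_{H,N}(x)\, \| f \|_{H(\Omega)}
$$

for all $f \in H(\Omega)$ and $x \in \Omega$. The bound follows from the reproducing property and the Cauchy-Schwarz inequality, since $\mathcal{P}_{H,N}(x)$ is the $H(\Omega)$-norm of $(I - \Pi_N) K(\cdot, x)$. This is the sense in which such bounds are geometric: $\mathcal{P}_{H,N}$ depends only on the kernel and the subspace $H_N$, not on the function being approximated.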