Numerical methods for the optimal feedback control of high-dimensional dynamical systems typically suffer from the curse of dimensionality. In this work, we devise a mesh-free, data-based approximation method for the value function of optimal control problems that partially mitigates the dimensionality problem. The method is based on a greedy Hermite kernel interpolation scheme and incorporates context knowledge through its structure. In particular, the value function surrogate is constructed to be zero at the target state, to be non-negative, and to act as a correction of a linearized model. The algorithm is formulated in a matrix-free way, which circumvents the large-matrix problem of multivariate Hermite interpolation. For finite time horizons, we prove convergence of the surrogate to the value function as well as convergence of the surrogate-controlled dynamical system to the optimally controlled one. Experiments support the effectiveness of the scheme, using, among others, a new academic model that has a scalable dimension and an explicitly given value function. This model may also be useful for the community to validate other optimal control approaches.
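To make the notion of Hermite kernel interpolation concrete, the following is a minimal one-dimensional sketch, not the greedy, matrix-free algorithm of this work. It assumes a Gaussian kernel, a dense linear solve, and the stand-in target function f(x) = x^2, which is zero at the target state 0 and non-negative. The function names (`gauss_terms`, `solve`, `fit_hermite`, `evaluate`) and the shape parameter `eps` are hypothetical choices made for illustration only.

```python
import math

def gauss_terms(x, y, eps=1.0):
    """Gaussian kernel k(x, y) = exp(-eps^2 (x - y)^2) and its derivatives (1D)."""
    e2 = eps * eps
    r = x - y
    g = math.exp(-e2 * r * r)
    k_y = 2.0 * e2 * r * g                         # dk/dy
    k_x = -k_y                                     # dk/dx
    k_xy = (2.0 * e2 - 4.0 * e2 * e2 * r * r) * g  # d^2 k / (dx dy)
    return g, k_x, k_y, k_xy

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (illustration only)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_hermite(xs, fs, ds, eps=1.0):
    """Fit s(x) = sum_j a_j k(x, x_j) + b_j dk/dy(x, x_j) to values fs and slopes ds."""
    n = len(xs)
    A = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            k, k_x, k_y, k_xy = gauss_terms(xs[i], xs[j], eps)
            A[i][j] = k              # value condition, value basis
            A[i][n + j] = k_y        # value condition, derivative basis
            A[n + i][j] = k_x        # slope condition, value basis
            A[n + i][n + j] = k_xy   # slope condition, derivative basis
    coef = solve(A, list(fs) + list(ds))
    return coef[:n], coef[n:]

def evaluate(x, xs, a, b, eps=1.0):
    """Return the interpolant value s(x) and its derivative s'(x)."""
    s = ds = 0.0
    for xj, aj, bj in zip(xs, a, b):
        k, k_x, k_y, k_xy = gauss_terms(x, xj, eps)
        s += aj * k + bj * k_y
        ds += aj * k_x + bj * k_xy
    return s, ds

# Stand-in data: f(x) = x^2 with derivative 2x at three centers.
xs = [-1.0, 0.0, 1.0]
fs = [x * x for x in xs]
ds = [2.0 * x for x in xs]
a, b = fit_hermite(xs, fs, ds)
```

The dense system here has size 2n in one dimension and n(d + 1) in d dimensions, which illustrates the large-matrix problem that a matrix-free formulation is meant to avoid. The sketch also does not enforce non-negativity of the interpolant away from the data points.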