The solution to a stochastic optimal control problem can be determined by computing the value function from a discretization of the associated Hamilton-Jacobi-Bellman equation. Alternatively, the problem can be reformulated in terms of a pair of forward-backward SDEs, which makes Monte Carlo techniques applicable. More recently, the problem has also been viewed from the perspective of forward and reverse-time SDEs and their associated Fokker-Planck equations, an approach closely related to techniques used in diffusion-based generative models. Forward and reverse-time formulations express the value function as the ratio of two probability density functions: one stemming from a forward McKean-Vlasov SDE and the other from a reverse McKean-Vlasov SDE. In this paper, we extend this approach to a more general class of stochastic optimal control problems and combine it with ensemble Kalman filter-type and diffusion map approximation techniques to obtain efficient and robust particle-based algorithms.
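For orientation, the Hamilton-Jacobi-Bellman equation mentioned above can be written out for one standard textbook problem class. The setting below is an assumption made purely for illustration and is not taken from the paper: additive control, quadratic control cost, and constant scalar noise level, with the symbols $b$, $c$, $f$, $\sigma$, and $T$ introduced here as hypothetical placeholders for the drift, running cost, terminal cost, noise amplitude, and time horizon.

```latex
% Illustrative setting (assumed, not the paper's general formulation):
% controlled SDE with additive control u_t and constant noise level \sigma
\begin{align*}
  \mathrm{d}X_t &= \bigl(b(X_t) + u_t\bigr)\,\mathrm{d}t + \sigma\,\mathrm{d}W_t, \\
  J(u) &= \mathbb{E}\!\left[\int_0^T \Bigl(c(X_t) + \tfrac{1}{2}\lVert u_t\rVert^2\Bigr)\,\mathrm{d}t + f(X_T)\right].
\end{align*}
% Value function V(t,x): minimal expected cost-to-go from state x at time t.
% Minimizing \tfrac12\|u\|^2 + u\cdot\nabla V over u gives u^* = -\nabla V,
% and substituting back yields the HJB equation with terminal condition:
\begin{align*}
  -\partial_t V &= c + b\cdot\nabla V - \tfrac{1}{2}\lVert\nabla V\rVert^2
                   + \tfrac{\sigma^2}{2}\,\Delta V,
  \qquad V(T,x) = f(x), \\
  u^*(t,x) &= -\nabla V(t,x).
\end{align*}
```

In this assumed setting, a grid-based discretization of the equation for $V$ becomes expensive as the state dimension grows, which is one common motivation for the sampling-based and particle-based alternatives the abstract goes on to describe.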