Optimal control of diffusion processes is intimately connected to the problem of solving certain Hamilton-Jacobi-Bellman equations. Building on recent machine-learning-inspired approaches to high-dimensional PDEs, we investigate the potential of $\textit{iterative diffusion optimisation}$ techniques, in particular considering applications in importance sampling and rare event simulation, and focusing on problems without diffusion control, with linearly controlled drift and running costs that depend quadratically on the control. More generally, our methods apply to nonlinear parabolic PDEs with a certain shift invariance. The choice of an appropriate loss function being a central element in the algorithmic design, we develop a principled framework based on divergences between path measures, encompassing various existing methods. Motivated by connections to forward-backward SDEs, we propose and study the novel $\textit{log-variance}$ divergence, showing favourable properties of corresponding Monte Carlo estimators. The promise of the developed approach is exemplified by a range of high-dimensional and metastable numerical examples.
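To make the idea of a variance-based loss concrete, the following is a minimal sketch of a Monte Carlo estimator of this kind. It assumes, since the abstract does not spell it out, that the log-variance divergence is the variance of the log-density ratio $\log \frac{\mathrm{d}P_1}{\mathrm{d}P_2}$ under a reference measure. As a further simplification, one-dimensional unit-variance Gaussians stand in for the path measures, so the log-density ratio is available in closed form. The function names `log_density_ratio` and `log_variance_estimate` and all parameters are hypothetical and chosen only for illustration.

```python
import random
import statistics


def log_density_ratio(x, mean_1, mean_2):
    """log(dP1/dP2)(x) for P1 = N(mean_1, 1) and P2 = N(mean_2, 1) (toy assumption)."""
    return (mean_1 - mean_2) * x - 0.5 * (mean_1**2 - mean_2**2)


def log_variance_estimate(mean_1, mean_2, ref_mean=0.0, n_samples=20000, seed=0):
    """Sample variance of log(dP1/dP2) under a reference measure N(ref_mean, 1).

    The reference measure is an illustrative choice; it need not equal P1 or P2.
    """
    rng = random.Random(seed)
    samples = [rng.gauss(ref_mean, 1.0) for _ in range(n_samples)]
    log_ratios = [log_density_ratio(x, mean_1, mean_2) for x in samples]
    return statistics.variance(log_ratios)


if __name__ == "__main__":
    # For unit-variance Gaussians the exact value is (mean_1 - mean_2)**2.
    print(log_variance_estimate(1.0, 0.0))
    # Identical measures: the log-ratio is constant, so the estimate vanishes.
    print(log_variance_estimate(0.5, 0.5))
    # A different reference measure targets the same value in this toy case.
    print(log_variance_estimate(1.0, 0.0, ref_mean=2.0))
```

In this toy setting the estimator is exactly zero whenever the two measures coincide, regardless of the sample size or the reference measure, because the log-density ratio is then constant. Whether and how this carries over to the path-measure setting is the subject of the paper itself and is not established by this sketch.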