We characterize Schr\"odinger bridge problems by a family of McKean-Vlasov stochastic control problems with no terminal-time distribution constraint. To do so, we use the theory of Hilbert space embeddings of probability measures and express the constraint as penalty terms, defined by the maximum mean discrepancy, in the control problems. A sequence of the probability laws of the state processes resulting from $\epsilon$-optimal controls converges to the unique solution of the Schr\"odinger problem under mild conditions on the given initial and terminal-time distributions and the underlying diffusion process. We propose a neural-SDE-based deep learning algorithm for the McKean-Vlasov stochastic control problems. Several numerical experiments validate our methods.
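To make the penalty term concrete, the following is a minimal sketch of an empirical squared maximum mean discrepancy between two samples, such as simulated terminal states and draws from the target terminal distribution. The Gaussian kernel, the `bandwidth` parameter, and the function names are illustrative assumptions, not choices stated in the abstract, and the biased (V-statistic) estimator is used only for simplicity.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between rows of x (n, d) and y (m, d).

    The kernel choice and bandwidth are assumptions made for illustration.
    """
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Guard against tiny negative values caused by floating-point rounding.
    sq_dist = np.maximum(sq_dist, 0.0)
    return np.exp(-sq_dist / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    simulated = rng.normal(size=(200, 2))       # stand-in for terminal states
    target = rng.normal(size=(200, 2)) + 3.0    # stand-in for target samples
    print("MMD^2, identical samples:", mmd_squared(simulated, simulated))
    print("MMD^2, shifted samples:  ", mmd_squared(simulated, target))
```

In a penalized control problem of the kind described above, a quantity like `mmd_squared` would presumably be scaled by a penalty weight and added to the running control cost. The exact form of the objective is not given in the abstract.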