We characterize the Schr\"odinger bridge problem by a family of McKean-Vlasov stochastic control problems with no constraint on the terminal time distribution. To do so, we use the theory of Hilbert space embeddings of probability measures and express the constraint as a penalty term, defined by the maximum mean discrepancy, in each control problem. A sequence of probability laws of the state processes resulting from $\epsilon$-optimal controls converges to the unique solution of the Schr\"odinger problem under mild conditions on the given initial and terminal time distributions and the underlying diffusion process. We propose a neural-SDE-based deep learning algorithm for the McKean-Vlasov stochastic control problems. Several numerical experiments validate our method.
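To illustrate how a maximum mean discrepancy (MMD) term can act as a penalty on the terminal distribution, the following is a minimal sketch of an empirical squared-MMD estimate between two sample sets. The Gaussian kernel, the bandwidth value, the biased (V-statistic) estimator, and the function names are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix k(x_i, y_j) for samples of shape (n, d) and (m, d).

    The kernel choice and bandwidth are illustrative assumptions.
    """
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


rng = np.random.default_rng(0)
terminal_samples = rng.normal(size=(200, 2))   # stand-in for simulated X_T
target_samples = terminal_samples + 3.0        # stand-in for the target law
penalty = mmd_squared(terminal_samples, target_samples)
```

In a penalized control problem of the kind described above, a term such as `penalty`, multiplied by a hypothetical weight, would be added to the running control cost so that the terminal law is pushed toward the target without being imposed as a hard constraint.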