Modern distribution matching algorithms for training diffusion or flow models directly prescribe the time evolution of the marginal distributions between two boundary distributions. In this work, we consider a generalized distribution matching setup, where these marginals are only implicitly described as the solution to a task-specific objective function. This problem setup, known as the Generalized Schr\"odinger Bridge (GSB), arises in many scientific areas both within and beyond machine learning. We propose Generalized Schr\"odinger Bridge Matching (GSBM), a new matching algorithm inspired by recent advances, which it generalizes beyond kinetic energy minimization to account for task-specific state costs. We show that such a generalization can be cast as solving conditional stochastic optimal control, for which efficient variational approximations can be used and further debiased with the aid of path integral theory. Compared to prior methods for solving GSB problems, our GSBM algorithm always preserves a feasible transport map between the boundary distributions throughout training, thereby enabling stable convergence and significantly improved scalability. We empirically validate our claims on an extensive suite of experimental setups, including crowd navigation, opinion depolarization, LiDAR manifolds, and image domain transfer. Our work brings new algorithmic opportunities for training diffusion models enhanced with task-specific optimality structures.
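For concreteness, a GSB problem of the kind described above is commonly written as a stochastic optimal control problem. The following is only a sketch under assumed notation: the drift $u_t$, the noise level $\sigma$, the state cost $V_t$, and the boundary distributions $\mu$ and $\nu$ are symbols introduced here for illustration and are not fixed by the text above.

\[
\min_{u} \int_0^1 \mathbb{E}\left[ \tfrac{1}{2}\,\|u_t(X_t)\|^2 + V_t(X_t) \right] dt
\quad \text{s.t.} \quad
dX_t = u_t(X_t)\,dt + \sigma\,dW_t, \qquad X_0 \sim \mu, \quad X_1 \sim \nu.
\]

In this form, the first term is the kinetic energy and $V_t$ is the task-specific state cost. Setting $V_t \equiv 0$ leaves only the kinetic energy term, which corresponds to the standard Schr\"odinger Bridge problem.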