We consider the generative problem of sampling from an unknown distribution for which only a sufficiently large number of training samples is available. In this paper, we build on previous work combining Schr\"odinger bridges and Langevin dynamics. A key bottleneck of this approach is the exponential dependence of the number of required training samples on the dimension, $d$, of the ambient state space. We propose a localization strategy which exploits conditional independence of conditional expectation values. Localization thus replaces a single high-dimensional Schr\"odinger bridge problem by $d$ low-dimensional Schr\"odinger bridge problems over the available training samples. In this context, a connection to multi-head self-attention transformer architectures is established. As with the original Schr\"odinger bridge sampling approach, the localized sampler is stable and geometrically ergodic. The sampler also extends naturally to conditional sampling and to Bayesian inference. We demonstrate the performance of the proposed scheme through experiments on a Gaussian problem with increasing dimension, on a temporal stochastic process, and on a stochastic subgrid-scale parametrization conditional sampling problem.