Inference-time alignment for diffusion models aims to adapt a pre-trained reference diffusion model toward a target distribution without retraining the reference score network, thereby preserving the generative capacity of the reference model while enforcing desired properties at inference time. A central mechanism for achieving such alignment is guidance, which modifies the sampling dynamics through an additional drift term. In this work, we introduce variationally stable Doob's matching, a novel framework for provable guidance estimation grounded in Doob's $h$-transform. Our approach formulates guidance as the gradient of the logarithm of an underlying Doob's $h$-function and employs gradient-regularized regression to estimate the $h$-function and its gradient simultaneously, yielding a consistent estimator of the guidance. Theoretically, we establish non-asymptotic convergence rates for the estimated guidance. Moreover, we analyze the resulting controlled diffusion processes and prove non-asymptotic convergence guarantees for the generated distributions in the 2-Wasserstein distance. Finally, we show that variationally stable guidance estimators adapt to unknown low-dimensional structure, effectively mitigating the curse of dimensionality under low-dimensional subspace assumptions.
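To make the estimation step concrete, the sketch below illustrates one way gradient-regularized regression for a Doob $h$-function could look in PyTorch. This is a minimal sketch under assumptions not stated in the abstract: `HNet`, `gradient_regularized_loss`, `guidance`, `h_target`, `grad_target`, and `lam` are hypothetical names, and the regression targets stand in for whatever (e.g., Monte Carlo) estimates of $h$ and $\nabla h$ the actual method constructs. The guidance is recovered as $\nabla_x \log h$ via automatic differentiation.

```python
import torch
import torch.nn as nn

# Hypothetical network for the Doob h-function h_theta(x, t); the abstract
# does not specify an architecture, so this choice is illustrative only.
class HNet(nn.Module):
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1), nn.Softplus(),  # keep h > 0 so log h is defined
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim), t: (batch, 1)
        return self.net(torch.cat([x, t], dim=-1)).squeeze(-1)

def gradient_regularized_loss(h_net, x, t, h_target, grad_target, lam=1.0):
    """Regression on h plus a penalty matching its spatial gradient.

    h_target (batch,) and grad_target (batch, dim) stand in for estimates
    of the h-function and its gradient; lam weights the gradient term.
    """
    x = x.detach().requires_grad_(True)
    h = h_net(x, t)
    (grad_h,) = torch.autograd.grad(h.sum(), x, create_graph=True)
    value_loss = ((h - h_target) ** 2).mean()
    grad_loss = ((grad_h - grad_target) ** 2).sum(dim=-1).mean()
    return value_loss + lam * grad_loss

def guidance(h_net, x, t):
    """Guidance drift grad_x log h(x, t), recovered via autograd."""
    x = x.detach().requires_grad_(True)
    (g,) = torch.autograd.grad(torch.log(h_net(x, t)).sum(), x)
    return g
```

At sampling time, the estimated guidance would enter the reference dynamics as the extra drift term, e.g. $\mathrm{d}X_t = \big[b(X_t,t) + \sigma(t)\sigma(t)^\top \nabla_x \log h(X_t,t)\big]\,\mathrm{d}t + \sigma(t)\,\mathrm{d}W_t$ for a reference drift $b$ and diffusion coefficient $\sigma$, which is the standard form of an $h$-transformed diffusion.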