We study conditional generation in diffusion models under hard constraints, where generated samples must satisfy prescribed events with probability one. Such constraints arise naturally in safety-critical applications and in rare-event simulation, where soft or reward-based guidance methods offer no guarantee of constraint satisfaction. Building on a probabilistic interpretation of diffusion models, we develop a principled conditional diffusion guidance framework based on Doob's h-transform, the martingale representation theorem, and the quadratic variation process. Specifically, the resulting guided dynamics augment a pretrained diffusion model with an explicit drift correction involving the logarithmic gradient of a conditioning function, without modifying the pretrained score network. Leveraging martingale and quadratic-variation identities, we propose two novel off-policy learning algorithms, based on a martingale loss and a martingale-covariation loss, to estimate h and its gradient using only trajectories from the pretrained model. We provide non-asymptotic guarantees for the resulting conditional sampler in both total variation and Wasserstein distances, explicitly characterizing the impact of score approximation and guidance estimation errors. Numerical experiments demonstrate the effectiveness of the proposed methods in enforcing hard constraints and generating rare-event samples.
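For concreteness, the drift correction described above can be sketched in standard Doob h-transform notation; the symbols below are generic (not necessarily the paper's own) and assume a pretrained diffusion with drift $b$ and diffusion coefficient $\sigma$:

```latex
% Pretrained dynamics:
%   dX_t = b(X_t, t)\,dt + \sigma(X_t, t)\,dW_t.
% Conditioning function (probability of the hard-constraint event A at terminal time T):
%   h(x, t) = \mathbb{P}\!\left( X_T \in A \mid X_t = x \right).
% Doob's h-transform yields the guided dynamics, which add an explicit
% drift correction via the logarithmic gradient of h, leaving b (and hence
% the pretrained score network) unchanged:
%   dX_t = \left[ b(X_t, t)
%          + \sigma(X_t, t)\sigma(X_t, t)^{\top} \nabla_x \log h(X_t, t) \right] dt
%          + \sigma(X_t, t)\,dW_t.
```

By construction, $h(X_t, t)$ is a martingale under the pretrained dynamics, which is what allows $h$ and $\nabla_x \log h$ to be estimated off-policy from pretrained trajectories alone.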