Pretrained diffusion models serve as frozen teachers for downstream pipelines such as text-to-3D, single-step distillation, and data attribution. The teacher gradients these pipelines consume are Monte Carlo (MC) expectations over noise levels and Gaussian noise samples, and their estimator variance dominates compute cost because each draw requires expensive upstream work (rendering, simulation, encoding). We introduce CARV, a compute-aware variance-accounting framework that motivates a hierarchical MC estimator: the expensive upstream computation is amortized over cheap diffusion-noise resamples, and the estimator is further sharpened by timestep importance sampling (IS) and a stratified inverse-CDF construction. In our text-to-3D distillation and attribution experiments, CARV delivers 2-3x effective compute multipliers without changing the objective; most of the gain comes from amortized reuse, with roughly 25% more from IS plus stratification. In single-step distillation, the same techniques cut gradient variance by an order of magnitude but do not improve downstream FID, marking the regime where MC variance is no longer the bottleneck.
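To make the shape of the hierarchical estimator concrete, the following is a minimal sketch and not the CARV implementation. It assumes a scalar upstream output, a target distribution that is uniform over T discrete timesteps, and a strictly positive proposal `q`. The names `upstream`, `teacher_term`, `stratified_timesteps`, and `hierarchical_estimate` are hypothetical placeholders introduced for illustration only.

```python
import bisect
import itertools
import random


def stratified_timesteps(q, k, rng):
    """Draw k timestep indices from proposal q via stratified inverse-CDF sampling."""
    total = float(sum(q))
    probs = [qi / total for qi in q]          # normalized proposal (assumed > 0)
    cdf = list(itertools.accumulate(probs))
    idx = []
    for i in range(k):
        u = (i + rng.random()) / k            # one uniform per stratum [i/k, (i+1)/k)
        t = bisect.bisect_right(cdf, u)       # inverse-CDF lookup
        idx.append(min(t, len(probs) - 1))    # guard against round-off at the top end
    return idx, probs


def hierarchical_estimate(upstream, teacher_term, q, n_outer, k_inner, rng):
    """Estimate E[teacher_term] under uniform timesteps and Gaussian noise.

    Each expensive upstream output is reused for k_inner cheap (t, eps) resamples.
    """
    num_steps = len(q)
    outer_sum = 0.0
    for _ in range(n_outer):
        x = upstream(rng)                     # expensive; computed once per outer sample
        idx, probs = stratified_timesteps(q, k_inner, rng)
        inner_sum = 0.0
        for t in idx:
            eps = rng.gauss(0.0, 1.0)         # cheap diffusion-noise resample
            weight = (1.0 / num_steps) / probs[t]   # importance weight: uniform target / proposal
            inner_sum += weight * teacher_term(x, t, eps)
        outer_sum += inner_sum / k_inner
    return outer_sum / n_outer
```

In this sketch, a call such as `hierarchical_estimate(render, grad_term, q, n_outer=64, k_inner=8, rng=random.Random(0))` (with `render` and `grad_term` supplied by the caller) would invoke the expensive `render` 64 times while evaluating the teacher term 512 times. Choosing `k_inner` is the compute-aware trade-off the abstract refers to; the right value depends on the relative cost of the two stages and is not specified here.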