In this work, we propose a framework that enables diffusion models to adapt their generation quality to real-time network bandwidth constraints. Traditional diffusion models produce high-fidelity images by performing a fixed number of denoising steps, regardless of downstream transmission limitations. In practical cloud-to-device scenarios, however, limited bandwidth often necessitates heavy compression, which discards fine textures and wastes the computation spent generating them. To address this, we introduce a joint end-to-end training strategy in which the diffusion model is conditioned on a target quality level derived from the available bandwidth. During training, the model learns to modulate the denoising process adaptively, enabling early-stop sampling that maintains a perceptual quality appropriate to the target transmission condition. Our method requires minimal architectural changes and uses a lightweight quality embedding to guide the denoising trajectory. Experimental results show that our approach improves the visual fidelity of bandwidth-adapted generations over naive early stopping, offering a promising solution for efficient image delivery in bandwidth-constrained environments. Code is available at: https://github.com/xzhang9308/BADiff.
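To make the conditioning idea concrete, the following is a minimal sketch of how a bandwidth measurement might be turned into a quality level, a lightweight embedding, and an early-stop step budget. It is not the released BADiff implementation: the function names, the log-scale bandwidth mapping, the sinusoidal embedding, the linear step budget, and all default values are assumptions made for illustration only.

```python
import math


def bandwidth_to_quality(bandwidth_kbps, min_kbps=64.0, max_kbps=4096.0):
    """Map a bandwidth measurement to a quality level q in [0, 1].

    Assumption: a log-scale mapping with hypothetical bounds; the abstract
    only says the quality level is derived from the available bandwidth.
    """
    clamped = min(max(bandwidth_kbps, min_kbps), max_kbps)
    return math.log(clamped / min_kbps) / math.log(max_kbps / min_kbps)


def quality_embedding(q, dim=8):
    """Encode the scalar quality level as a small sinusoidal vector.

    Assumption: a timestep-style sinusoidal embedding stands in for the
    "lightweight quality embedding" mentioned in the abstract.
    """
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    scaled = 1000.0 * q
    return [math.sin(scaled * f) for f in freqs] + [math.cos(scaled * f) for f in freqs]


def step_budget(q, max_steps=50, min_steps=10):
    """Choose how many denoising steps to run before stopping early.

    Assumption: a linear schedule between two hypothetical bounds.
    """
    return min_steps + round(q * (max_steps - min_steps))


def sample(denoise_step, x, bandwidth_kbps, max_steps=50):
    """Run quality-conditioned denoising and stop after the step budget.

    `denoise_step(x, t, emb)` is a hypothetical callable standing in for one
    reverse-diffusion update of a model conditioned on the quality embedding.
    """
    q = bandwidth_to_quality(bandwidth_kbps)
    emb = quality_embedding(q)
    n_steps = step_budget(q, max_steps=max_steps)
    # Walk the usual descending timestep schedule, but only for n_steps steps.
    for t in range(max_steps, max_steps - n_steps, -1):
        x = denoise_step(x, t, emb)
    return x, n_steps


if __name__ == "__main__":
    # Dummy update that just counts how many steps were applied.
    final, used = sample(lambda x, t, emb: x + 1, 0, bandwidth_kbps=512.0)
    print(f"steps used: {used}")
```

In this sketch the same embedding is passed to every denoising step, so a jointly trained model could learn to produce an output suited to the target quality by the time the budget runs out, which is the behaviour that distinguishes the proposed approach from naive early stopping.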