DiffusionLight-Turbo: Accelerated Light Probes for Free via Single-Pass Chrome Ball Inpainting

We introduce a simple yet effective technique for estimating lighting from a single low-dynamic-range (LDR) image by reframing the task as a chrome ball inpainting problem. This approach leverages a pre-trained diffusion model, Stable Diffusion XL, to overcome the generalization failures of existing methods that rely on limited HDR panorama datasets. While conceptually simple, the task remains challenging because diffusion models often insert incorrect or inconsistent content and cannot readily generate chrome balls in HDR format. Our analysis reveals that the inpainting process is highly sensitive to the initial noise in the diffusion process, occasionally resulting in unrealistic outputs. To address this, we first introduce DiffusionLight, which uses iterative inpainting: it computes a median chrome ball from multiple outputs, which serves as a stable, low-frequency lighting prior that guides the generation of a high-quality final result. To generate high-dynamic-range (HDR) light probes, we fine-tune an Exposure LoRA to create LDR images at multiple exposure values, which are then merged. While effective, DiffusionLight is time-intensive, requiring approximately 30 minutes per estimation. To reduce this overhead, we introduce DiffusionLight-Turbo, which reduces the runtime to about 30 seconds with minimal quality loss. This 60x speedup is achieved by training a Turbo LoRA to directly predict the averaged chrome balls from the iterative process. Inference is further streamlined into a single denoising pass using a LoRA swapping technique. Experimental results show that our method produces convincing light estimates across diverse settings and demonstrates superior generalization to in-the-wild scenarios. Our code is available at https://diffusionlight.github.io/turbo.
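To make two of the steps above concrete, the following is a minimal NumPy sketch of (a) taking a per-pixel median over several inpainted chrome balls and (b) merging an exposure bracket of LDR images into one HDR estimate. It is an illustration only, not the authors' implementation: the function names `median_chrome_ball` and `merge_exposures`, the gamma value, the saturation threshold, and the weighting scheme are all assumptions introduced here, and the abstract does not specify how the merge is actually performed.

```python
import numpy as np


def median_chrome_ball(samples):
    """Per-pixel median over N chrome ball images of shape (N, H, W, 3).

    Hypothetical helper: the median suppresses outlier samples caused by
    unlucky initial noise, leaving a stable low-frequency estimate.
    """
    stack = np.asarray(samples, dtype=np.float32)
    return np.median(stack, axis=0)


def merge_exposures(ldr_images, evs, gamma=2.2, sat=0.95):
    """Merge LDR images rendered at different exposure values into HDR.

    Assumptions (not from the paper): pixel values lie in [0, 1], are
    encoded with a simple power-law gamma, and an image at exposure value
    `ev` shows scene radiance scaled by 2**ev before clipping.
    """
    num = 0.0
    den = 0.0
    for img, ev in zip(ldr_images, evs):
        img = np.asarray(img, dtype=np.float32)
        linear = img ** gamma              # undo the assumed gamma encoding
        radiance = linear / (2.0 ** ev)    # bring to a common radiance scale
        # Down-weight clipped pixels so darker exposures recover highlights.
        weight = np.where(img < sat, 1.0, 1e-3).astype(np.float32)
        num = num + weight * radiance
        den = den + weight
    return num / den


if __name__ == "__main__":
    # Three toy "chrome balls"; the third is an outlier sample.
    balls = [np.full((2, 2, 3), v, dtype=np.float32) for v in (0.1, 0.2, 0.9)]
    print(median_chrome_ball(balls)[0, 0])

    # A bright pixel with true linear radiance 2.0, seen at EV 0 and EV -2.
    true_radiance = 2.0
    bracket = []
    for ev in (0.0, -2.0):
        exposed = min(true_radiance * 2.0 ** ev, 1.0)  # sensor clipping
        bracket.append(np.full((1, 1, 3), exposed ** (1 / 2.2), dtype=np.float32))
    print(merge_exposures(bracket, [0.0, -2.0])[0, 0])
```

In the toy bracket, the EV 0 image is clipped, so the merged value is dominated by the EV -2 image and lands close to the true radiance; this is the general reason a multi-exposure bracket can recover intensities beyond the LDR range.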
