Most digital videos are stored in 8-bit low dynamic range (LDR) formats, where much of the original high dynamic range (HDR) scene radiance is lost due to saturation and quantization. This loss of highlight and shadow detail precludes mapping accurate luminance to HDR displays and limits meaningful re-exposure in post-production workflows. Although techniques have been proposed to convert LDR images to HDR through dynamic range expansion, they struggle to restore realistic detail in the over- and underexposed regions. To address this, we present DiffHDR, a framework that formulates LDR-to-HDR conversion as a generative radiance inpainting task within the latent space of a video diffusion model. By operating in Log-Gamma color space, DiffHDR leverages spatio-temporal generative priors from a pretrained video diffusion model to synthesize plausible HDR radiance in over- and underexposed regions while recovering the continuous scene radiance of the quantized pixels. Our framework further enables controllable LDR-to-HDR video conversion guided by text prompts or reference images. To address the scarcity of paired HDR video data, we develop a pipeline that synthesizes high-quality HDR video training data from static HDRI maps. Extensive experiments demonstrate that DiffHDR significantly outperforms state-of-the-art approaches in radiance fidelity and temporal stability, producing realistic HDR videos with considerable latitude for re-exposure.
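The loss described above (saturation plus 8-bit quantization) can be illustrated with a small sketch. It is not the DiffHDR pipeline: the gamma-2.2 camera model, the thresholds 5 and 250, and the function names `simulate_ldr`, `exposure_masks`, and `naive_inverse` are assumptions made only for illustration. It shows why directly inverting the camera curve cannot restore clipped highlights, which is the gap a generative inpainting approach is meant to fill.

```python
import numpy as np

def simulate_ldr(hdr, exposure=1.0, gamma=2.2):
    """Map linear HDR radiance to an 8-bit LDR frame (assumed simple gamma model)."""
    exposed = np.clip(hdr * exposure, 0.0, 1.0)         # saturation: radiance above 1 is clipped
    encoded = exposed ** (1.0 / gamma)                  # gamma encoding
    return np.round(encoded * 255.0).astype(np.uint8)   # quantization to 256 levels

def exposure_masks(ldr, low=5, high=250):
    """Return (underexposed, overexposed) boolean masks; thresholds are assumed."""
    return ldr <= low, ldr >= high

def naive_inverse(ldr, exposure=1.0, gamma=2.2):
    """Invert the assumed camera model; clipped pixels stay capped at 1 / exposure."""
    return (ldr.astype(np.float64) / 255.0) ** gamma / exposure

# Hypothetical scene radiance spanning deep shadow to bright highlights.
hdr = np.array([0.0001, 0.05, 0.5, 1.0, 4.0, 16.0])
ldr = simulate_ldr(hdr)
under, over = exposure_masks(ldr)
recovered = naive_inverse(ldr)
```

In this toy example the radiance values 1.0, 4.0, and 16.0 all map to the same 8-bit code, so `recovered` cannot distinguish them. The masks mark the pixels whose content would have to be synthesized rather than inverted.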