Full-image relighting remains challenging for three reasons: large-scale structured paired data is hard to collect, physical plausibility is hard to maintain, and data-driven priors limit generalizability. Existing attempts to bridge the synthetic-to-real gap for full-scene relighting remain suboptimal. To tackle these challenges, we introduce Physics-Inspired diffusion for full-image reLight ($π$-Light, or PI-Light), a two-stage framework that leverages physics-inspired diffusion models. Our design incorporates (i) batch-aware attention, which improves the consistency of intrinsic predictions across a collection of images; (ii) a physics-guided neural rendering module that enforces physically plausible light transport; (iii) physics-inspired losses that regularize training dynamics toward a physically meaningful landscape, thereby enhancing generalizability to real-world image editing; and (iv) a carefully curated dataset of diverse objects and scenes captured under controlled lighting conditions. Together, these components enable efficient finetuning of pretrained diffusion models, and the dataset provides a solid benchmark for downstream evaluation. Experiments demonstrate that $π$-Light synthesizes specular highlights and diffuse reflections across a wide variety of materials and generalizes better to real-world scenes than prior approaches.