We introduce LumiNet, a novel architecture that leverages generative models and latent intrinsic representations for effective lighting transfer. Given a source image and a target lighting image, LumiNet synthesizes a relit version of the source scene that captures the target's lighting. Our approach makes two key contributions: a data curation strategy that uses a StyleGAN-based relighting model to generate our training data, and a modified diffusion-based ControlNet that processes both the latent intrinsic properties of the source image and the latent extrinsic (lighting) properties of the target image. We further improve lighting transfer through a learned adaptor (MLP) that injects the target's latent extrinsic properties via cross-attention and fine-tuning. Unlike a traditional ControlNet, which generates images from conditional maps of a single scene, LumiNet processes latent representations from two different images, preserving geometry and albedo from the source while transferring the lighting characteristics of the target. Experiments demonstrate that our method successfully transfers complex lighting phenomena, including specular highlights and indirect illumination, across scenes with varying spatial layouts and materials, outperforming existing approaches on challenging indoor scenes using only images as input.
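To make the adaptor idea concrete, below is a minimal sketch (not the authors' released code) of how a latent extrinsic code from the target image could be mapped by an MLP into tokens that condition a denoising backbone's features via cross-attention. All class names, tensor shapes, and dimensions (e.g., `extrinsic_dim=512`, `token_dim=768`) are illustrative assumptions, not values from the paper.

```python
# Hypothetical sketch of MLP-adaptor + cross-attention lighting injection.
# Shapes and module names are assumptions chosen for illustration only.
import torch
import torch.nn as nn

class ExtrinsicAdaptor(nn.Module):
    """Projects a latent extrinsic (lighting) code into conditioning tokens."""
    def __init__(self, extrinsic_dim=512, token_dim=768, num_tokens=4):
        super().__init__()
        self.num_tokens, self.token_dim = num_tokens, token_dim
        self.mlp = nn.Sequential(
            nn.Linear(extrinsic_dim, token_dim * num_tokens),
            nn.GELU(),
            nn.Linear(token_dim * num_tokens, token_dim * num_tokens),
        )

    def forward(self, extrinsic):                 # extrinsic: (B, extrinsic_dim)
        tokens = self.mlp(extrinsic)              # (B, num_tokens * token_dim)
        return tokens.view(-1, self.num_tokens, self.token_dim)

class CrossAttentionInjection(nn.Module):
    """Residually injects lighting tokens into backbone features."""
    def __init__(self, feat_dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(feat_dim)

    def forward(self, feats, light_tokens):       # feats: (B, N, feat_dim)
        # Queries come from the source-branch features; keys/values are the
        # target's lighting tokens, so only lighting information flows in.
        attended, _ = self.attn(self.norm(feats), light_tokens, light_tokens)
        return feats + attended                   # residual update

# Toy usage: condition 64 spatial feature tokens on one extrinsic code.
adaptor = ExtrinsicAdaptor()
inject = CrossAttentionInjection()
feats = torch.randn(2, 64, 768)     # source-branch features (e.g., from a ControlNet block)
extrinsic = torch.randn(2, 512)     # target image's latent extrinsic code
out = inject(feats, adaptor(extrinsic))
print(out.shape)                    # torch.Size([2, 64, 768])
```

The residual cross-attention form is a common choice for this kind of conditioning because it leaves the source features intact when the attention output is small, which matches the stated goal of preserving the source's geometry and albedo while importing only the target's lighting.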