Inverse rendering of urban scenes from captured videos enables numerous applications, including content creation and autonomous driving simulation. Physically based rendering methods follow the physics of lighting and allow it to be controlled, but they suffer from reconstruction and rendering artifacts. Generative models produce realistic videos, but they offer limited consistency and controllability. We present BRDFusion, a unified framework that combines these two complementary kinds of model, physical and generative, for both inverse and forward rendering. Specifically, for inverse rendering, BRDFusion recovers explicit, consistent scene properties through physical modeling and uses generative priors to alleviate optimization ambiguity. During forward rendering, the physical model provides controllable rendering from the scene configuration, and the generative model denoises the result and corrects artifacts. As a result, our method produces high-quality videos while allowing precise control, and it outperforms baselines on both real and synthetic scenes. Moreover, BRDFusion supports novel-view relighting, night simulation, and the insertion and editing of dynamic objects. Project page: https://shigon255.github.io/brdfusion-page/