Neural fields have recently enjoyed great success in representing and rendering 3D scenes. However, most state-of-the-art implicit representations model a static or dynamic scene as a whole, allowing only minor variations. Existing work on learning disentangled world and object neural fields does not consider the problem of composing objects into different world neural fields in a lighting-aware manner. We present Lighting-Aware Neural Field (LANe) for the compositional synthesis of driving scenes in a physically consistent manner. Specifically, we learn a scene representation that disentangles the static background and transient elements into a world-NeRF and class-specific object-NeRFs, allowing compositional synthesis of multiple objects in the scene. Furthermore, we explicitly design both the world and object models to handle lighting variation, which allows us to compose objects into scenes with spatially varying lighting. This is achieved by constructing a light field of the scene and using it in conjunction with a learned shader to modulate the appearance of the object-NeRFs. We evaluate our model on a synthetic dataset of diverse lighting conditions rendered with the CARLA simulator, as well as on a novel real-world dataset of cars collected at different times of the day. Our approach outperforms state-of-the-art compositional scene synthesis methods in this challenging setup, composing object-NeRFs learned from one scene into an entirely different scene while still respecting the lighting variations in the novel scene. For more results, please visit our project website: https://lane-composition.github.io/.