Scene stylization extends neural style transfer to three spatial dimensions. A key challenge in this problem is maintaining a consistent stylized appearance across multiple views. The vast majority of previous works achieve this by optimizing the scene for one specific style image. In contrast, we propose a novel architecture trained on a collection of style images that, at test time, produces high-quality stylized novel views. Our work builds upon the framework of 3D Gaussian splatting. For a given scene, we take the pretrained Gaussians and process them using a multi-resolution hash grid and a tiny MLP to obtain the conditional stylized views. The explicit nature of 3D Gaussians gives us inherent advantages over NeRF-based methods, including geometric consistency and a fast training and rendering regime. This makes our method useful for a wide range of practical use cases, such as augmented or virtual reality applications. Through our experiments, we show that our method achieves state-of-the-art performance with superior visual quality on various indoor and outdoor real-world data.
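The pipeline described above (pretrained Gaussians passed through a multi-resolution hash grid and a tiny MLP, conditioned on a style) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the class `HashGrid`, the functions `tiny_mlp` and `stylize_gaussians`, the choice to concatenate a style embedding `style_vec` with the hash features and the original colors, the layer sizes, and the use of random, untrained weights. How the actual method injects the style condition and which Gaussian attributes it modifies may differ.

```python
import numpy as np

# Large primes for spatial hashing (an illustrative choice).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)


class HashGrid:
    """Multi-resolution hash encoding with random, untrained feature tables."""

    def __init__(self, n_levels=4, table_size=2**12, feat_dim=2,
                 base_res=4, growth=2.0, seed=0):
        rng = np.random.default_rng(seed)
        self.table_size = table_size
        self.res = [int(base_res * growth**l) for l in range(n_levels)]
        self.tables = [rng.normal(0.0, 1e-2, (table_size, feat_dim))
                       for _ in range(n_levels)]

    def _hash(self, ijk):
        # ijk: (N, 3) non-negative integer grid coordinates.
        h = ijk.astype(np.uint64) * PRIMES  # wraps on overflow, as intended
        idx = h[:, 0] ^ h[:, 1] ^ h[:, 2]
        return (idx % np.uint64(self.table_size)).astype(np.int64)

    def encode(self, xyz):
        # xyz: (N, 3) positions normalized to [0, 1].
        feats = []
        for res, table in zip(self.res, self.tables):
            p = xyz * res
            p0 = np.floor(p).astype(np.int64)
            w = p - p0  # fractional offset inside the cell
            f = 0.0
            for corner in range(8):  # trilinear interpolation over 8 corners
                off = np.array([(corner >> k) & 1 for k in range(3)])
                cw = np.prod(np.where(off == 1, w, 1.0 - w),
                             axis=1, keepdims=True)
                f = f + cw * table[self._hash(p0 + off)]
            feats.append(f)
        return np.concatenate(feats, axis=1)


def init_mlp(in_dim, hidden, out_dim, rng):
    # Two-layer MLP with random weights (stand-in for trained parameters).
    return [(rng.normal(0, 0.1, (in_dim, hidden)), np.zeros(hidden)),
            (rng.normal(0, 0.1, (hidden, out_dim)), np.zeros(out_dim))]


def tiny_mlp(x, params):
    (w1, b1), (w2, b2) = params
    h = np.maximum(x @ w1 + b1, 0.0)             # ReLU
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))  # sigmoid -> RGB in (0, 1)


def stylize_gaussians(means, base_rgb, style_vec, grid, params):
    """Predict a stylized color per Gaussian (hypothetical conditioning)."""
    lo, hi = means.min(axis=0), means.max(axis=0)
    xyz = (means - lo) / np.maximum(hi - lo, 1e-8)  # normalize to [0, 1]
    feats = grid.encode(xyz)
    style = np.broadcast_to(style_vec, (len(means), style_vec.shape[0]))
    return tiny_mlp(np.concatenate([feats, base_rgb, style], axis=1), params)


# Toy usage: 100 random "pretrained" Gaussians and two style embeddings.
rng = np.random.default_rng(0)
means = rng.uniform(-1.0, 1.0, (100, 3))
base_rgb = rng.uniform(0.0, 1.0, (100, 3))
grid = HashGrid()
params = init_mlp(in_dim=4 * 2 + 3 + 8, hidden=32, out_dim=3, rng=rng)
rgb_a = stylize_gaussians(means, base_rgb, rng.normal(size=8), grid, params)
rgb_b = stylize_gaussians(means, base_rgb, rng.normal(size=8), grid, params)
```

In this sketch the geometry of the Gaussians (their means) is left untouched and only the colors change with the style embedding, which is one plausible way an explicit representation could keep stylized views geometrically consistent.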