Pose-conditioned convolutional generative models struggle to generate high-quality, 3D-consistent images from single-view datasets because they lack sufficient 3D priors. Recently, the integration of Neural Radiance Fields (NeRFs) with generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images. NeRF-GANs exploit the strong inductive bias of 3D neural representations and volumetric rendering, at the cost of higher computational complexity. This study revisits pose-conditioned 2D GANs for efficient 3D-aware generation at inference time by distilling 3D knowledge from pretrained NeRF-GANs. We propose a simple and effective method based on reusing the well-disentangled latent space of a pretrained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations. Experiments on several datasets demonstrate that the proposed method obtains results comparable to volumetric rendering in terms of quality and 3D consistency while benefiting from the computational efficiency of convolutional networks. The code will be available at: https://github.com/mshahbazi72/NeRF-GAN-Distillation
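To make the distillation idea concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a frozen "teacher" that maps a latent code and a camera yaw to an image, and a cheap pose-conditioned "student" that receives the same latent code and pose and is trained to reproduce the teacher's output. All names here (`teacher_render`, `pose_features`, `student_generate`, `distill`) are hypothetical. The teacher is a closed-form stand-in for a pretrained NeRF-GAN with volumetric rendering, the student is a single linear layer rather than a convolutional generator, and the plain squared-error objective is an assumption, since the abstract does not specify the training losses.

```python
import math
import random


def teacher_render(latent, yaw):
    """Toy stand-in for a frozen NeRF-GAN renderer (hypothetical): a 2-pixel 'image'."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [latent[0] * c + latent[1] * s, latent[1] * c - latent[0] * s]


def pose_features(latent, yaw):
    """Pose-conditioned student input: the shared latent code modulated by the camera yaw."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [latent[0] * c, latent[0] * s, latent[1] * c, latent[1] * s]


def student_generate(weights, latent, yaw):
    """Toy student: one linear layer applied to the pose-conditioned features."""
    feats = pose_features(latent, yaw)
    return [sum(w * f for w, f in zip(row, feats)) for row in weights]


def distill(steps=2000, lr=0.05, seed=0):
    """Fit the student to teacher renderings of randomly sampled (latent, pose) pairs."""
    rng = random.Random(seed)
    weights = [[0.0] * 4 for _ in range(2)]
    for _ in range(steps):
        # Sample from the teacher's latent space and from a pose distribution.
        latent = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        yaw = rng.uniform(-math.pi, math.pi)
        target = teacher_render(latent, yaw)  # teacher output is the supervision signal
        feats = pose_features(latent, yaw)
        pred = student_generate(weights, latent, yaw)
        for i in range(2):
            err = pred[i] - target[i]
            for j in range(4):
                weights[i][j] -= lr * err * feats[j]  # SGD step on the squared error
    return weights


weights = distill()
z, yaw = [0.3, -1.2], 0.7
print(student_generate(weights, z, yaw), teacher_render(z, yaw))
```

Because the student shares the teacher's latent code, the same code rendered at two different poses refers to the same underlying object, which is the property the abstract relies on for 3D consistency. At inference time only the student is evaluated, so the cost of the teacher's rendering is paid during training only.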