Pose-conditioned convolutional generative models struggle to generate high-quality, 3D-consistent images from single-view datasets because they lack sufficient 3D priors. Recently, the integration of Neural Radiance Fields (NeRFs) with generative models such as Generative Adversarial Networks (GANs) has transformed 3D-aware generation from single-view images. NeRF-GANs exploit the strong inductive bias of neural 3D representations and volumetric rendering, at the cost of higher computational complexity. This study revisits pose-conditioned 2D GANs for efficient 3D-aware generation at inference time by distilling 3D knowledge from pretrained NeRF-GANs. We propose a simple and effective method that re-uses the well-disentangled latent space of a pretrained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations. Experiments on several datasets demonstrate that the proposed method obtains results comparable with volumetric rendering in terms of quality and 3D consistency while benefiting from the computational advantage of convolutional networks. The code will be available at: https://github.com/mshahbazi72/NeRF-GAN-Distillation
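The distillation idea can be illustrated with a deliberately tiny sketch; it is not the released implementation. Everything in it is an assumption made for illustration: `teacher_render` is a hypothetical stand-in for a frozen NeRF-GAN (here just a 2D rotation of the latent code by the camera angle), the student is a single linear layer rather than a convolutional generator, and the plain squared-error loss is a placeholder, since the abstract does not specify the training objectives. What the sketch does show is the overall pattern: teacher and student share the same latent input, the teacher renders a target for a sampled (latent, pose) pair, and the pose-conditioned student is trained to reproduce it.

```python
import math
import random


def teacher_render(w, pose):
    """Hypothetical stand-in for a frozen NeRF-GAN: 'renders' a 2-pixel image
    by rotating the latent code w by the camera angle pose."""
    c, s = math.cos(pose), math.sin(pose)
    return [w[0] * c + w[1] * s, -w[0] * s + w[1] * c]


def features(w, pose):
    """Pose-conditioned student input: latent entries modulated by the pose."""
    c, s = math.cos(pose), math.sin(pose)
    return [w[0] * c, w[0] * s, w[1] * c, w[1] * s]


def student_generate(weights, w, pose):
    """Toy student: one linear layer standing in for a convolutional generator."""
    phi = features(w, pose)
    return [sum(a * b for a, b in zip(row, phi)) for row in weights]


def sample(rng):
    """Draw a latent code from the shared latent space and a random camera pose."""
    w = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    pose = rng.uniform(-math.pi / 2, math.pi / 2)
    return w, pose


def distill(steps=2000, lr=0.05, seed=0):
    """Train the student to match the teacher with SGD on a squared-error loss."""
    rng = random.Random(seed)
    weights = [[0.0] * 4 for _ in range(2)]
    for _ in range(steps):
        w, pose = sample(rng)
        target = teacher_render(w, pose)  # teacher is frozen; it only supplies targets
        phi = features(w, pose)
        pred = student_generate(weights, w, pose)
        for i in range(2):
            err = pred[i] - target[i]
            for j in range(4):
                # Gradient of 0.5 * err**2 with respect to weights[i][j]
                weights[i][j] -= lr * err * phi[j]
    return weights


def heldout_mse(weights, n=200, seed=1):
    """Mean squared error between student and teacher on fresh (latent, pose) pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        w, pose = sample(rng)
        t = teacher_render(w, pose)
        p = student_generate(weights, w, pose)
        total += sum((a - b) ** 2 for a, b in zip(p, t)) / len(t)
    return total / n
```

Once `distill()` has run, only `student_generate` is needed to produce an image for a given latent code and pose; the teacher is no longer called. In this toy setting the teacher is exactly representable by the student, so the held-out error can be driven close to zero, which would not be expected of a real NeRF-GAN teacher.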