Pose-conditioned convolutional generative models struggle to generate high-quality, 3D-consistent images from single-view datasets because they lack sufficient 3D priors. Recently, combining Neural Radiance Fields (NeRFs) with generative models such as Generative Adversarial Networks (GANs) has transformed 3D-aware generation from single-view images. NeRF-GANs exploit the strong inductive bias of 3D neural representations and volumetric rendering, at the cost of higher computational complexity. This study revisits pose-conditioned 2D GANs for memory-efficient 3D-aware generation at inference time by distilling 3D knowledge from pretrained NeRF-GANs. We propose a simple and effective method that re-uses the well-disentangled latent space of a pretrained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations. Experiments on several datasets demonstrate that the proposed method obtains results comparable to volumetric rendering in quality and 3D consistency while benefiting from the computational efficiency of convolutional networks. The code will be available at: https://github.com/mshahbazi72/NeRF-GAN-Distillation
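The distillation idea can be illustrated with a deliberately tiny, pure-Python sketch. It is not the paper's implementation, and the abstract does not specify the losses or architectures used. Every name here (`make_teacher`, `student`, `run_distillation`, `LATENT_DIM`, `N_PIXELS`) is hypothetical. The "teacher" is a fixed random function standing in for a frozen NeRF-GAN renderer, the "student" is a linear map standing in for the pose-conditioned convolutional generator, and a plain squared-error objective is assumed. The only point it shows is the training pattern: sample a latent code and a pose, query the frozen teacher, and update the student so that it reproduces the teacher's output for the same latent and pose.

```python
import math
import random

LATENT_DIM = 4  # hypothetical latent size, chosen only for this toy
N_PIXELS = 8    # hypothetical "image" size (a flat list of pixel values)


def make_teacher(rng):
    """Frozen stand-in for a pretrained NeRF-GAN: (latent, yaw) -> pixels."""
    a = [[rng.uniform(-1, 1) for _ in range(LATENT_DIM)] for _ in range(N_PIXELS)]
    b = [rng.uniform(-1, 1) for _ in range(N_PIXELS)]

    def teacher(latent, yaw):
        return [
            sum(a_ij * z_j for a_ij, z_j in zip(row, latent)) + b_i * math.sin(yaw)
            for row, b_i in zip(a, b)
        ]

    return teacher


def features(latent, yaw):
    """Student input: the teacher's own latent code plus a pose encoding."""
    return list(latent) + [math.sin(yaw)]


def student(weights, latent, yaw):
    """Linear stand-in for the pose-conditioned generator (assumption)."""
    phi = features(latent, yaw)
    return [sum(w_ij * p_j for w_ij, p_j in zip(row, phi)) for row in weights]


def sample(rng):
    """Draw a latent code and a camera yaw (in radians)."""
    latent = [rng.gauss(0, 1) for _ in range(LATENT_DIM)]
    yaw = rng.uniform(-1, 1)
    return latent, yaw


def mean_sq_error(weights, teacher, batch):
    """Average per-pixel squared error between student and teacher."""
    total = 0.0
    for latent, yaw in batch:
        pred = student(weights, latent, yaw)
        target = teacher(latent, yaw)
        total += sum((p - t) ** 2 for p, t in zip(pred, target)) / N_PIXELS
    return total / len(batch)


def run_distillation(seed=0, steps=2000, lr=0.05):
    """Train the student to match the teacher; return (error_before, error_after)."""
    rng = random.Random(seed)
    teacher = make_teacher(rng)
    weights = [[0.0] * (LATENT_DIM + 1) for _ in range(N_PIXELS)]
    held_out = [sample(rng) for _ in range(50)]
    before = mean_sq_error(weights, teacher, held_out)
    for _ in range(steps):
        latent, yaw = sample(rng)
        phi = features(latent, yaw)
        target = teacher(latent, yaw)  # the teacher is only queried, never updated
        pred = student(weights, latent, yaw)
        for i in range(N_PIXELS):
            err = pred[i] - target[i]
            for j in range(len(phi)):
                weights[i][j] -= lr * err * phi[j]  # SGD step on squared error
    after = mean_sq_error(weights, teacher, held_out)
    return before, after
```

Calling `run_distillation()` returns the held-out error before and after training. In this toy the student can represent the teacher exactly, so the error shrinks toward zero; a real convolutional student imitating a volumetric renderer would only approximate it.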