Recent work has shown that 3D-aware GANs trained on unstructured single-image collections can generate multiview images of novel instances. The key components that make this possible are a 3D radiance field generator and a volume rendering process. However, existing methods are either limited to low resolutions (e.g., up to 256×256) by the high computational cost of neural volume rendering, or rely on 2D CNNs for image-space upsampling, which jeopardizes 3D consistency across views. This paper proposes a novel 3D-aware GAN that can generate high-resolution images (up to 1024×1024) while keeping the strict 3D consistency of volume rendering. Our motivation is to perform super-resolution directly in 3D space so that 3D consistency is preserved. We avoid the otherwise prohibitively expensive computation by applying 2D convolutions to the set of 2D radiance manifolds defined in the recent generative radiance manifold (GRAM) approach, and we apply dedicated loss functions for effective GAN training at high resolution. Experiments on the FFHQ and AFHQv2 datasets show that our method produces high-quality, 3D-consistent results that significantly outperform existing methods. This work takes a significant step towards closing the gap between traditional 2D image generation and 3D-consistent free-view generation.
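To make the rendering step concrete, the following is a minimal sketch of front-to-back alpha compositing along a single ray that crosses a set of surfaces. It is an illustration under stated assumptions, not the paper's implementation: the function name `composite_ray` is hypothetical, and the sketch assumes that the color and occupancy at each ray-surface intersection have already been sampled and are ordered nearest-first.

```python
def composite_ray(colors, alphas):
    """Composite samples along one ray, ordered nearest-first.

    colors: list of (r, g, b) tuples, one per surface intersection.
    alphas: list of occupancy values in [0, 1], same length as colors.
    Returns the composited (r, g, b) and the per-sample weights.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(colors) != len(alphas):
        raise ValueError("colors and alphas must have the same length")
    pixel = [0.0, 0.0, 0.0]
    weights = []
    transmittance = 1.0  # fraction of light not yet blocked by nearer surfaces
    for color, alpha in zip(colors, alphas):
        weight = transmittance * alpha  # contribution of this intersection
        weights.append(weight)
        for channel in range(3):
            pixel[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha  # attenuate for the surfaces behind
    return tuple(pixel), weights


# A half-transparent red surface in front of an opaque blue one.
pixel, weights = composite_ray([(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)], [0.5, 1.0])
print(pixel)  # (0.5, 0.0, 0.5)
```

Because each pixel is a weighted sum of radiance stored on the surfaces themselves, any upsampling applied to the per-surface radiance before this step is shared by every camera view, which is the intuition behind performing super-resolution in 3D space rather than in image space.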