Existing methods for 3D-aware image synthesis largely depend on a 3D pose distribution pre-estimated on the training set. An inaccurate estimate may mislead the model into learning faulty geometry. This work proposes PoF3D, which frees generative radiance fields from the requirement of 3D pose priors. We first equip the generator with an efficient pose learner, which infers a pose from a latent code, to approximate the underlying true pose distribution automatically. We then assign the discriminator the task of learning the pose distribution under the supervision of the generator and of differentiating real from synthesized images with the predicted pose as the condition. The pose-free generator and the pose-aware discriminator are jointly trained in an adversarial manner. Extensive results on a couple of datasets confirm that the performance of our approach, in terms of both image quality and geometry quality, is on par with the state of the art. To the best of our knowledge, PoF3D is the first to demonstrate the feasibility of learning high-quality 3D-aware image synthesis without using 3D pose priors.
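The pose learner and the generator-to-discriminator pose supervision described above can be illustrated with a minimal sketch. The abstract does not specify the pose parameterisation, the network architecture, or the loss, so everything below is an assumption made for illustration: the (yaw, pitch) parameterisation, the two-layer MLP, the angle bounds, the squared-error loss, and the names `make_linear`, `linear`, `pose_learner`, and `pose_supervision_loss` are all hypothetical.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Create random weights and zero biases for one dense layer (assumed init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(layer, x):
    """Apply a dense layer: y = W x + b."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def pose_learner(layers, z, max_yaw=math.pi / 2, max_pitch=math.pi / 4):
    """Map a latent code z to a (yaw, pitch) pose in radians.

    Assumption: the pose is two bounded angles; tanh keeps each angle
    strictly inside its (hypothetical) range.
    """
    hidden = [math.tanh(h) for h in linear(layers[0], z)]
    raw_yaw, raw_pitch = linear(layers[1], hidden)
    return max_yaw * math.tanh(raw_yaw), max_pitch * math.tanh(raw_pitch)


def pose_supervision_loss(generator_pose, discriminator_pose):
    """Squared error between the generator's pose and the discriminator's
    prediction for the same synthesized image (assumed loss form)."""
    return sum((g - d) ** 2 for g, d in zip(generator_pose, discriminator_pose))


rng = random.Random(0)
latent_dim, hidden_dim = 8, 16
layers = [make_linear(latent_dim, hidden_dim, rng), make_linear(hidden_dim, 2, rng)]

# Sample a latent code and infer the pose used to render the synthesized image.
z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
yaw, pitch = pose_learner(layers, z)

# Placeholder for the discriminator's pose estimate on that image.
predicted_pose = (0.0, 0.0)
loss = pose_supervision_loss((yaw, pitch), predicted_pose)
```

In a full adversarial setup, the discriminator's predicted pose would also serve as the condition for its real-versus-synthesized decision; that part is omitted here because the abstract gives no detail on how the conditioning is implemented.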