While recent 3D-aware generative models have demonstrated photo-realistic image synthesis with multi-view consistency, the quality of the synthesized images degrades depending on the camera pose (e.g., a face with a blurry and noisy boundary at a side viewpoint). This degradation is mainly caused by the difficulty of learning pose consistency and photo-realism simultaneously from a dataset with heavily imbalanced poses. In this paper, we propose SideGAN, a novel 3D GAN training method that generates photo-realistic images irrespective of the camera pose, especially for faces at side-view angles. To ease the challenging problem of learning photo-realistic and pose-consistent image synthesis, we split it into two subproblems, each of which can be solved more easily. Specifically, we formulate the problem as a combination of two simple discrimination problems: one learns to discriminate whether a synthesized image looks real, and the other learns to discriminate whether a synthesized image agrees with the camera pose. Based on this formulation, we propose a dual-branched discriminator with two discrimination branches. We also propose a pose-matching loss for learning the pose consistency of 3D GANs. In addition, we present a pose sampling strategy that increases learning opportunities for steep angles in a pose-imbalanced dataset. Through extensive validation, we demonstrate that our approach enables 3D GANs to generate high-quality geometries and photo-realistic images irrespective of the camera pose.
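To make the two-branch formulation and the pose sampling idea more concrete, the following Python sketch may help. It is not the authors' implementation, since the abstract does not give the loss or the sampling distribution. The names `sample_yaw` and `dual_branch_d_loss`, the mixing probability `uniform_prob`, the yaw range, the use of a logistic (softplus) GAN loss, and the treatment of pose matching as a matched-versus-mismatched pair classification are all assumptions made for illustration.

```python
import math
import random


def sample_yaw(dataset_yaws, uniform_prob=0.5, max_yaw=math.pi / 2, rng=random):
    """Draw one yaw angle (radians) for rendering a synthesized image.

    Assumption: with probability `uniform_prob` the yaw is drawn uniformly
    from [-max_yaw, max_yaw], so steep angles are sampled as often as frontal
    ones; otherwise a pose is reused from the (imbalanced) dataset.
    """
    if rng.random() < uniform_prob:
        return rng.uniform(-max_yaw, max_yaw)
    return rng.choice(dataset_yaws)


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dual_branch_d_loss(real_img_logit, fake_img_logit,
                       matched_pose_logit, mismatched_pose_logit):
    """Hypothetical discriminator loss with two branches.

    Image branch: does the image look real? (real vs. synthesized)
    Pose branch:  does the image agree with the camera pose?
                  (matched vs. mismatched image-pose pair)
    Both terms use the standard logistic GAN discriminator loss.
    """
    image_loss = softplus(-real_img_logit) + softplus(fake_img_logit)
    pose_loss = softplus(-matched_pose_logit) + softplus(mismatched_pose_logit)
    return image_loss + pose_loss


if __name__ == "__main__":
    # A toy, frontal-only dataset: every stored yaw is 0 radians.
    frontal_only = [0.0] * 100
    rng = random.Random(0)
    yaws = [sample_yaw(frontal_only, rng=rng) for _ in range(5)]
    print("sampled yaws:", yaws)
    print("loss:", dual_branch_d_loss(2.0, -2.0, 1.0, -1.0))
```

In this sketch, splitting the loss into two independent terms mirrors the abstract's decomposition into a realism problem and a pose-agreement problem, while mixing uniform samples with dataset poses is one simple way to give steep angles more training signal than the dataset alone provides.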