3D GAN inversion aims to project a single image into the latent space of a 3D Generative Adversarial Network (GAN), thereby achieving 3D geometry reconstruction. While there exist encoders that achieve good results in 3D GAN inversion, they are predominantly built on EG3D, which specializes in synthesizing near-frontal views and is limited in synthesizing comprehensive 3D scenes from diverse viewpoints. In contrast to existing approaches, we propose a novel framework built on PanoHead, which excels in synthesizing images from a 360-degree perspective. To achieve realistic 3D modeling of the input image, we introduce a dual encoder system tailored for high-fidelity reconstruction and realistic generation from different viewpoints. Accompanying this, we propose a stitching framework in the triplane domain that combines the best predictions from both encoders. To achieve seamless stitching, both encoders must output consistent results despite being specialized for different tasks. For this reason, we carefully train these encoders using specialized losses, including an adversarial loss based on our novel occlusion-aware triplane discriminator. Experiments reveal that our approach surpasses existing encoder training methods both qualitatively and quantitatively. Please visit the project page: https://berkegokmen1.github.io/dual-enc-3d-gan-inv.
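The triplane stitching described above can be illustrated as a mask-weighted blend of two triplane feature volumes. This is only a minimal sketch under assumed conventions: the function name, the `(3, C, H, W)` triplane layout, and the soft visibility mask are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def stitch_triplanes(tp_recon, tp_gen, visibility):
    """Blend two triplane feature volumes (hypothetical sketch).

    tp_recon   : (3, C, H, W) triplane features from the reconstruction encoder.
    tp_gen     : (3, C, H, W) triplane features from the generation encoder.
    visibility : (3, 1, H, W) soft mask in [0, 1]; 1 where the input view
                 directly observes the region, so the high-fidelity
                 reconstruction encoder is trusted there.
    """
    # Convex combination per spatial location; broadcasting expands the
    # single mask channel across all C feature channels.
    return visibility * tp_recon + (1.0 - visibility) * tp_gen

# Toy usage: trust the reconstruction features on the left half only.
tp_a = np.ones((3, 32, 64, 64), dtype=np.float32)
tp_b = np.zeros((3, 32, 64, 64), dtype=np.float32)
mask = np.zeros((3, 1, 64, 64), dtype=np.float32)
mask[..., :32] = 1.0
stitched = stitch_triplanes(tp_a, tp_b, mask)
```

A hard binary mask as above produces visible seams; in practice a smoothly feathered mask (or a learned blending weight) would be needed for the seamless stitching the abstract refers to.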