Although 3D-aware GANs based on neural radiance fields have achieved competitive performance, their applicability is still limited to objects or scenes that come with ground-truth camera poses, or pose prediction models, defined with respect to a clear canonical camera pose. To extend the scope of applicable datasets, we propose a novel 3D-aware GAN optimization technique based on contrastive learning with implicit pose embeddings. To this end, we first revise the discriminator design to remove its dependency on ground-truth camera poses. Then, to capture complex and challenging 3D scene structures more effectively, we make the discriminator estimate a high-dimensional implicit pose embedding from a given image and perform contrastive learning on that embedding. Because the proposed approach neither looks up nor estimates camera poses, it can be applied to datasets in which the canonical camera pose is ill-defined. Experimental results show that our algorithm outperforms existing methods by large margins on datasets with multiple object categories and inconsistent canonical camera poses.
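To make the contrastive step concrete, the following is a minimal NumPy sketch of an InfoNCE-style loss over pose embeddings. It is an illustration under stated assumptions, not the implementation used in this work: the abstract does not specify how positive and negative pairs are formed, so the sketch assumes that the embedding estimated by the discriminator for a generated image is paired with the embedding that conditioned the generator, and that the other embeddings in the batch act as negatives. The function names, the temperature value, and the embedding size are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def info_nce_loss(predicted, target, temperature=0.1):
    """InfoNCE-style loss: row i of `predicted` should match row i of `target`.

    predicted: (B, D) embeddings estimated from images (assumed discriminator output).
    target:    (B, D) embeddings the images were generated from (assumed pairing).
    """
    p = l2_normalize(predicted)
    t = l2_normalize(target)
    logits = p @ t.T / temperature                        # (B, B) scaled similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # subtract row max for stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))             # positives lie on the diagonal


rng = np.random.default_rng(0)
sampled = rng.normal(size=(8, 64))                            # hypothetical 64-d pose embeddings
matched = sampled + 0.01 * rng.normal(size=sampled.shape)     # near-perfect estimates
shuffled = np.roll(sampled, 1, axis=0)                        # estimates paired with the wrong targets

loss_matched = info_nce_loss(matched, sampled)
loss_shuffled = info_nce_loss(shuffled, sampled)
print(loss_matched, loss_shuffled)
```

In this toy setup the loss is small when each estimated embedding agrees with its own target and large when the pairing is shuffled, which is the signal a discriminator trained this way would receive. No camera pose appears anywhere in the computation, consistent with the claim that the approach neither looks up nor estimates poses.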