Automatic 3D facial texture generation has recently attracted significant interest. Existing approaches either do not support the traditional physically based rendering (PBR) pipeline or rely on 3D data captured by a Light Stage. Our key contribution is a progressive latent space refinement approach that bootstraps from 3D Morphable Model (3DMM)-based texture maps generated from facial images to produce high-quality and diverse PBR textures, including albedo, normal, and roughness maps. We first enhance Generative Adversarial Networks (GANs) for text-guided and diverse texture generation. To this end, we design a self-supervised paradigm that removes the reliance on ground-truth 3D textures and trains the generative model with only entangled texture maps. In addition, we foster mutual enhancement between GANs and Score Distillation Sampling (SDS): SDS enriches GANs with more generative modes, while GANs enable more efficient optimization of SDS. Furthermore, we introduce an edge-aware SDS for multi-view consistent facial structure. Experiments demonstrate that our method outperforms existing 3D texture generation methods in photo-realistic quality, diversity, and efficiency.
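For reference, the generic Score Distillation Sampling gradient, as introduced in DreamFusion, can be written as below. This is the standard formulation only, not the edge-aware variant proposed here. The notation is an assumption borrowed from that formulation rather than from this abstract: $\theta$ denotes the parameters being optimized, $g$ a differentiable renderer, $\hat{\epsilon}_\phi$ the noise prediction of a pretrained text-conditioned diffusion model, $y$ the text prompt, and $w(t)$ a timestep-dependent weight. How this work couples the gradient to the GAN latent space is not stated in the abstract.

```latex
% Standard SDS gradient (generic form; notation assumed, not taken from this abstract)
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad
x = g(\theta), \quad
x_t = \alpha_t\, x + \sigma_t\, \epsilon, \quad
\epsilon \sim \mathcal{N}(0, I)
```

Because this gradient skips the diffusion model's Jacobian, each step costs one forward pass of the denoiser plus backpropagation through the renderer. Convergence therefore depends heavily on where the optimization starts, which is consistent with the claim above that GANs can make SDS optimization more efficient.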