Recent advances in 3D face stylization have made significant strides in few-shot and zero-shot settings. However, the degree of stylization achieved by existing methods is often insufficient for practical applications because most are based on statistical 3D Morphable Models (3DMMs), which offer limited shape variation. To this end, we propose a method that produces a highly stylized 3D face model with a desired topology. Our method trains a surface deformation network with a 3DMM and translates its domain to the target style using a paired exemplar. The network stylizes the 3D face mesh by mimicking the style of the target through a differentiable renderer and directional CLIP losses. Additionally, at inference time, we employ a Mesh-Agnostic Encoder (MAGE) that takes the deformation target, a mesh of arbitrary topology, as input to the stylization process and encodes its shape into our latent space. The resulting stylized face model can be animated with commonly used 3DMM blend shapes. Quantitative and qualitative evaluations demonstrate that our method produces highly stylized face meshes faithful to a given style and outputs them in a desired topology. We also demonstrate example applications of our method, including image-based stylized avatar generation, linear interpolation of geometric styles, and facial animation of stylized avatars.
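The directional CLIP loss mentioned above can be illustrated with a minimal sketch. It encourages the shift in image-embedding space between a render of the source face and a render of the stylized face to align with the shift in text-embedding space between a source prompt and a style prompt. This sketch assumes the four embeddings are already produced by a CLIP encoder; the function names and the use of plain NumPy vectors are illustrative, not the paper's implementation:

```python
import numpy as np

def directional_clip_loss(e_img_src, e_img_tgt, e_txt_src, e_txt_tgt, eps=1e-8):
    """Sketch of a directional CLIP loss.

    e_img_src / e_img_tgt: CLIP image embeddings of the source and
        stylized renders (here: plain vectors for illustration).
    e_txt_src / e_txt_tgt: CLIP text embeddings of the source and
        style prompts.
    Returns 1 - cos(d_img, d_txt), which is 0 when the image-space
    shift points in the same direction as the text-space shift.
    """
    d_img = e_img_tgt - e_img_src   # direction of the visual change
    d_txt = e_txt_tgt - e_txt_src   # direction of the semantic change
    cos = np.dot(d_img, d_txt) / (
        np.linalg.norm(d_img) * np.linalg.norm(d_txt) + eps
    )
    return 1.0 - cos
```

When the two directions are parallel the loss approaches 0; when they are orthogonal it equals 1, so minimizing it steers the render's change toward the style described by the prompt pair.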