Face swapping has gained significant traction, driven by rapid progress in deep-learning-based human face synthesis. However, previous face swapping methods built on generative adversarial network (GAN) backbones suffer from inconsistent blending, distortions, artifacts, and unstable training. To address these limitations, we propose an end-to-end framework for high-fidelity face swapping. First, we introduce a StyleGAN-based facial attribute encoder that extracts essential features from a face and inverts them into a latent style code encapsulating the facial attributes needed for face swapping. Second, we introduce an attention-based style blending module that transfers face identity (face ID) from the source to the target. To ensure accurate, high-quality transfer, we apply a set of constraints: contrastive face ID learning, facial landmark alignment, and dual swap consistency. Finally, the blended style code is translated back to image space by the style decoder, which offers high training stability and strong generative capability. Extensive experiments on the CelebA-HQ dataset demonstrate the superior visual quality of images generated by our method compared with other state-of-the-art methods, as well as the effectiveness of each proposed module. Source code and weights will be made publicly available.
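To make the idea of attention-based style blending concrete, the following is a minimal sketch and not the implementation described above. It assumes each face is represented by a list of style vectors (as in a StyleGAN-like latent space), lets every target style vector attend over the source style vectors with scaled dot-product attention, and mixes the attended result back into the target. The function name `blend_styles` and the scalar `gate` are hypothetical; a trained module would typically use learned query, key, and value projections, which are omitted here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def blend_styles(target_codes, source_codes, gate=0.5):
    """Blend source style vectors into target style vectors.

    target_codes, source_codes: lists of style vectors (lists of floats)
    with the same dimensionality.
    gate: hypothetical mixing weight in [0, 1]; 0 keeps the target code,
    1 replaces it with the attended source code.
    """
    dim = len(target_codes[0])
    scale = 1.0 / math.sqrt(dim)
    blended = []
    for query in target_codes:
        # Attention weights of this target vector over all source vectors.
        weights = softmax([dot(query, key) * scale for key in source_codes])
        # Weighted sum of source vectors (the source codes act as keys and values).
        attended = [
            sum(w * value[i] for w, value in zip(weights, source_codes))
            for i in range(dim)
        ]
        # Convex combination of the target vector and the attended source vector.
        blended.append(
            [(1.0 - gate) * q + gate * a for q, a in zip(query, attended)]
        )
    return blended


if __name__ == "__main__":
    target = [[1.0, 0.0], [0.0, 1.0]]
    source = [[2.0, 2.0], [0.5, -0.5]]
    for vector in blend_styles(target, source, gate=0.5):
        print(vector)
```

In this sketch the output has one blended vector per target vector, so it could be passed to a style decoder in place of the original target code. The constraints named above (contrastive face ID learning, landmark alignment, dual swap consistency) would enter as training losses and are not modeled here.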