Facial style transfer has become popular among researchers with the rise of emerging technologies such as eXtended Reality (XR), the Metaverse, and Non-Fungible Tokens (NFTs). StyleGAN methods, combined with transfer-learning strategies, have alleviated the problem of limited data to some extent. However, most StyleGAN methods overfit to the styles and introduce artifacts into facial images. In this paper, we propose a facial pose awareness and style transfer (Face-PAST) network that preserves facial details and structures while generating high-quality stylized images. Our work is inspired by Dual StyleGAN, but in contrast, it uses a pre-trained style generation network in an external style pass with a residual modulation block instead of a transform coding block. Furthermore, we use a gated mapping unit together with facial structure, identity, and segmentation losses to preserve the facial structure and details. This enables us to train the network with a very limited amount of data while still generating high-quality stylized images. Our training process adopts a curriculum learning strategy to perform efficient and flexible style mixing in the generative space. We perform extensive experiments to show the superiority of Face-PAST over existing state-of-the-art methods.