Facial style transfer has attracted considerable research interest with the rise of emerging technologies such as eXtended Reality (XR), the Metaverse, and Non-Fungible Tokens (NFTs). Furthermore, StyleGAN methods combined with transfer-learning strategies have alleviated the problem of limited data to some extent. However, most StyleGAN methods overfit to the styles and introduce artifacts into facial images. In this paper, we propose a facial pose awareness and style transfer (Face-PAST) network that preserves facial details and structures while generating high-quality stylized images. Our work is inspired by Dual StyleGAN but, in contrast, uses a pre-trained style generation network in an external style pass with a residual modulation block instead of a transform coding block. Furthermore, we use a gated mapping unit together with facial structure, identity, and segmentation losses to preserve the facial structure and details. This enables us to train the network with a very limited amount of data while still generating high-quality stylized images. Our training process adopts a curriculum learning strategy to perform efficient and flexible style mixing in the generative space. We perform extensive experiments to show the superiority of Face-PAST over existing state-of-the-art methods.