With the surge in emerging technologies such as the Metaverse, spatial computing, and generative AI, facial style transfer has attracted considerable interest from researchers and startup enthusiasts alike. StyleGAN methods have paved the way for transfer-learning strategies that reduce the dependency on the large volumes of data otherwise required for training. However, StyleGAN methods tend to overfit, which introduces artifacts into the facial images. Studies such as DualStyleGAN proposed the use of multipath networks, but they require the networks to be trained for a specific style rather than generating a fusion of facial styles at once. In this paper, we propose a FusIon of STyles network (FISTNet) for facial images that leverages pre-trained multipath style transfer networks to remove the need for a large volume of training data while fusing multiple styles at the output. We leverage pre-trained StyleGAN networks with an external style pass that uses a residual modulation block instead of a transform coding block. The method also preserves facial structure, identity, and details via the gated mapping unit introduced in this study. These components enable us to train the network with a very limited amount of data while generating high-quality stylized images. Our training process adopts a curriculum learning strategy to perform efficient, flexible style and model fusion in the generative space. We perform extensive experiments to show the superiority of FISTNet in comparison to existing state-of-the-art methods.