Generative Adversarial Networks (GANs) are capable of synthesizing high-quality facial images. Despite their success, GANs do not provide any information about the relationship between the input vectors and the generated images. Currently, facial GANs are trained on imbalanced datasets, and as a result they generate less diverse images. For example, more than 77% of the 100K images that we randomly synthesized using StyleGAN3 are classified as Happy, and only around 3% are Angry. The problem becomes even worse when a mixture of facial attributes is desired: less than 1% of the generated samples are Angry Woman, and only around 2% are Happy Black. To address these problems, this paper proposes a framework, called GANalyzer, for the analysis and manipulation of the latent space of well-trained GANs. GANalyzer consists of a set of transformation functions designed to manipulate latent vectors for a specific facial attribute such as Expression, Age, Gender, or Race. We analyze facial attribute entanglement in the latent space of GANs and apply the proposed transformations to edit the disentangled facial attributes. Our experimental results demonstrate the strength of GANalyzer in editing facial attributes and generating any desired face. We also create and release a balanced, photo-realistic human face dataset. Our code is publicly available on GitHub.
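To make the idea of a latent-space transformation concrete, here is a minimal sketch of one simple and widely used form: a linear shift of the latent vector along an attribute direction, with the component shared with a second attribute projected out to reduce entanglement. This is an illustration only and is not necessarily the transformation used by GANalyzer. The helper names (`remove_component`, `shift_latent`), the 512-dimensional latent size, and the random attribute directions are all assumptions; in practice the directions would be estimated from labeled samples.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def remove_component(direction, other):
    """Project `direction` onto the subspace orthogonal to `other`.

    Moving along the result changes the first attribute while leaving
    the linear score of the second attribute unchanged.
    """
    other = normalize(other)
    overlap = dot(direction, other)
    return normalize([d - overlap * o for d, o in zip(direction, other)])


def shift_latent(z, direction, alpha):
    """Move latent vector z by step size alpha along `direction`."""
    return [zi + alpha * di for zi, di in zip(z, direction)]


rng = random.Random(0)
dim = 512  # assumed latent dimensionality

# A random latent vector, as would be fed to a generator.
z = [rng.gauss(0.0, 1.0) for _ in range(dim)]

# Hypothetical attribute directions (random stand-ins for this sketch).
expression_dir = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
gender_dir = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])

# Edit Expression while holding the Gender score fixed.
disentangled_dir = remove_component(expression_dir, gender_dir)
z_edit = shift_latent(z, disentangled_dir, alpha=3.0)
```

The edited vector `z_edit` would then be passed to the generator in place of `z`. The projection step only removes linear overlap between the two directions, so it is a rough stand-in for the entanglement analysis described in the abstract.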