Navigating the latent space of StyleGAN has proven effective for face editing. However, existing methods usually struggle with complex navigation because different attributes are entangled in the latent space. To address this issue, this paper proposes a novel framework, termed SDFlow, that performs a semantic decomposition of the original latent space using continuous conditional normalizing flows. Specifically, SDFlow decomposes the original latent code into distinct, mutually unrelated variables by jointly optimizing two components: (i) a semantic encoder that estimates semantic variables from input faces and (ii) a flow-based transformation module that maps the latent code to a semantic-irrelevant variable following a Gaussian distribution, conditioned on the learned semantic variables. To eliminate the entanglement between variables, we employ a disentangled learning strategy under a mutual information framework, thereby providing precise manipulation controls. Experimental results demonstrate that SDFlow outperforms existing state-of-the-art face editing methods both qualitatively and quantitatively. The source code is available at https://github.com/phil329/SDFlow.
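The decomposition described in the abstract can be illustrated with a toy sketch. The code below is not the SDFlow implementation: it replaces the continuous conditional normalizing flow with a single conditional affine transform, uses fixed random weights in place of a learned conditioning network, and treats the semantic variables `s` as given instead of estimating them with a semantic encoder. All names and dimensions (`LATENT_DIM`, `SEMANTIC_DIM`, `forward`, `inverse`, `edit`) are hypothetical. It only shows the general idea: map a latent code `w` to a semantic-irrelevant residual `z` conditioned on `s`, then edit by inverting the map under modified semantic variables.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8    # toy latent size (hypothetical, not taken from the paper)
SEMANTIC_DIM = 3  # number of semantic variables (hypothetical)

# Fixed random weights standing in for a learned conditioning network.
W_SCALE = 0.1 * rng.standard_normal((SEMANTIC_DIM, LATENT_DIM))
W_SHIFT = rng.standard_normal((SEMANTIC_DIM, LATENT_DIM))


def forward(w, s):
    """Map latent code w to a residual z, conditioned on semantics s.

    Returns z and the log-determinant of the Jacobian dz/dw, which a
    flow would use in its Gaussian log-likelihood objective.
    """
    log_scale = s @ W_SCALE
    shift = s @ W_SHIFT
    z = (w - shift) * np.exp(-log_scale)
    log_det = -np.sum(log_scale, axis=-1)
    return z, log_det


def inverse(z, s):
    """Exact inverse of forward: recover a latent code from z and s."""
    log_scale = s @ W_SCALE
    shift = s @ W_SHIFT
    return z * np.exp(log_scale) + shift


def edit(w, s_src, s_tgt):
    """Edit w by keeping its residual z and swapping the semantics."""
    z, _ = forward(w, s_src)
    return inverse(z, s_tgt)


w = rng.standard_normal(LATENT_DIM)   # stand-in for a StyleGAN latent code
s_src = np.array([0.0, 1.0, -0.5])    # stand-in for encoder output
s_tgt = s_src.copy()
s_tgt[0] = 2.0                        # change only the first semantic variable

w_edit = edit(w, s_src, s_tgt)
```

Because the transform is invertible, editing with unchanged semantic variables returns the original latent code exactly (up to floating-point error); only a change in `s` moves `w`. In a trained model the residual `z` would additionally be pushed toward a Gaussian distribution and made independent of `s`, which this sketch does not attempt.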