The absence of real targets to guide model training is one of the main challenges in the makeup transfer task. Most existing methods tackle this problem by synthesizing pseudo ground truths (PGTs). However, the generated PGTs are often sub-optimal, and their imprecision eventually leads to performance degradation. To alleviate this issue, in this paper, we propose a novel Content-Style Decoupled Makeup Transfer (CSD-MT) method, which works in a purely unsupervised manner and thus eliminates the negative effects of generating PGTs. Specifically, based on an analysis of frequency characteristics, we assume that the low-frequency (LF) component of a face image is more associated with its makeup style information, while the high-frequency (HF) component is more related to its content details. This assumption allows CSD-MT to decouple the content and makeup style information in each face image through frequency decomposition. After that, CSD-MT realizes makeup transfer by maximizing the consistency of these two types of information between the transferred result and the respective input images. Two newly designed loss functions are also introduced to further improve the transfer performance. Extensive quantitative and qualitative analyses show the effectiveness of our CSD-MT method. Our code is available at https://github.com/Snowfallingplum/CSD-MT.
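To make the decoupling assumption concrete, the sketch below splits an image into LF and HF components with a simple Gaussian low-pass filter: the blurred copy stands in for the makeup-style (LF) component and the residual for the content (HF) component, and the two sum back to the original exactly. This is only an illustrative decomposition under our own assumptions (pure NumPy, single-channel input, a hand-rolled Gaussian kernel); it is not the paper's actual implementation.

```python
import numpy as np

def gaussian_kernel(size: int = 9, sigma: float = 3.0) -> np.ndarray:
    """Normalized 2-D Gaussian kernel (assumed filter, not the paper's)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def frequency_decompose(image: np.ndarray, size: int = 9, sigma: float = 3.0):
    """Split a single-channel image into LF and HF components.

    LF = Gaussian-blurred copy (illustrating the makeup-style component),
    HF = image - LF (illustrating the content component),
    so LF + HF reconstructs the input exactly.
    """
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="reflect")
    h, w = image.shape
    lf = np.empty((h, w), dtype=float)
    for i in range(h):          # direct 2-D convolution, kept simple for clarity
        for j in range(w):
            lf[i, j] = np.sum(padded[i:i + size, j:j + size] * k)
    hf = image - lf
    return lf, hf

# Example: decompose a random "face" image.
img = np.random.rand(32, 32)
lf, hf = frequency_decompose(img)
```

Because HF is defined as the residual, the decomposition is lossless; a transfer model can then enforce style consistency on LF (with the reference face) and content consistency on HF (with the source face) without ever needing a pseudo ground truth.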