Hairstyle transfer is a challenging image editing task that modifies the hairstyle of a given face image while preserving the rest of its appearance and the background. Existing hairstyle transfer approaches rely heavily on StyleGAN, which is pre-trained on cropped and aligned face images. Hence, they struggle to generalize under challenging conditions such as extreme variations in head pose or focal length. To address this issue, we propose HairFusion, a one-stage hairstyle transfer diffusion model applicable to real-world scenarios. Specifically, we carefully design a hair-agnostic representation as the input to the model, in which the original hair information is thoroughly eliminated. Next, we introduce hair-align cross-attention (Align-CA) to accurately align the reference hairstyle with the face image while accounting for the difference in their head poses. To better preserve the face image's original features, we apply adaptive hair blending during inference, where the output's hair regions are estimated from the cross-attention map in Align-CA and blended with the non-hair areas of the face image. Our experimental results show that our method achieves state-of-the-art performance compared to existing methods in preserving the integrity of both the transferred hairstyle and the surrounding features. The code is available at https://github.com/cychungg/HairFusion.
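The adaptive hair blending step described above can be sketched as a soft compositing operation: a per-pixel hair mask estimated from the attention map gates the generated output, and the original face image fills the non-hair regions. The sketch below is a minimal illustration in NumPy; the function and parameter names are assumptions, not the paper's actual implementation.

```python
import numpy as np

def adaptive_hair_blend(generated, original, attn_map):
    """Soft-blend a generated image with the original face image.

    generated, original : float arrays of shape (H, W, 3), values in [0, 1]
    attn_map            : float array of shape (H, W), per-pixel hair score
                          (here assumed to come from a cross-attention map,
                          normalized to [0, 1])

    Returns the composite: hair regions from `generated`, everything
    else from `original`.
    """
    # Clamp the attention-derived scores to a valid soft mask in [0, 1]
    # and add a channel axis so it broadcasts over RGB.
    mask = np.clip(attn_map, 0.0, 1.0)[..., None]  # (H, W, 1)
    return mask * generated + (1.0 - mask) * original
```

With a mask of all ones the composite equals the generated image; with all zeros it equals the original, so only the estimated hair region is actually modified.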