We introduce an industrial Head Blending pipeline for the task of seamlessly integrating an actor's head onto a target body in digital content creation. The key challenge stems from discrepancies in head shape and hair structure, which lead to unnatural boundaries and blending artifacts. Existing methods treat foreground and background as a single task, resulting in suboptimal blending quality. To address this problem, we propose CHANGER, a novel pipeline that decouples background integration from foreground blending. By utilizing chroma keying for artifact-free background generation and introducing Head shape and long Hair augmentation ($H^2$ augmentation) to simulate a wide range of head shapes and hairstyles, CHANGER improves generalization to diverse real-world cases. Furthermore, our Foreground Predictive Attention Transformer (FPAT) module enhances foreground blending by predicting and focusing on key head and body regions. Quantitative and qualitative evaluations on benchmark datasets demonstrate that CHANGER outperforms state-of-the-art methods, delivering high-fidelity, industrial-grade results.