Recent generative image editing methods adopt layered representations to mitigate the entangled nature of raster images and improve controllability, typically relying on object-based segmentation. However, such strategies may fail to capture the structural and stylized properties of human-created images, such as anime illustrations. To address this issue, we propose a workflow-aware structured layer decomposition framework tailored to anime illustration production. Inspired by the anime production pipeline, our method decomposes an illustration into semantically meaningful production layers: line art, flat color, shadow, and highlight. To decouple these layers, we introduce lightweight layer semantic embeddings that provide task-specific guidance for each layer. Furthermore, a set of layer-wise losses supervises the training of each individual layer. To overcome the lack of ground-truth layered data, we construct a high-quality illustration dataset that simulates the standard anime production workflow. Experiments demonstrate that our method achieves accurate and visually coherent layer decompositions. We believe that the resulting layered representation further enables downstream tasks such as recoloring and texture embedding, supporting content creation and illustration editing. Code is available at: https://github.com/zty0304/Anime-layer-decomposition
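To make concrete why separate production layers help with editing, the following minimal sketch recomposes a single pixel from the four layers named in the abstract and recolors the flat-color layer independently of shading. The abstract does not state the compositing model, so the blend modes used here (multiply for shadow, screen for highlight, alpha-over for line art) are assumptions borrowed from common digital illustration practice, and the function names `compose_pixel` and `recolor` are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]  # channel values in [0, 1]


def multiply(base: RGB, layer: RGB) -> RGB:
    """Multiply blend: darkens; a white layer (1.0) leaves the base unchanged."""
    return tuple(b * l for b, l in zip(base, layer))


def screen(base: RGB, layer: RGB) -> RGB:
    """Screen blend: lightens; a black layer (0.0) leaves the base unchanged."""
    return tuple(1.0 - (1.0 - b) * (1.0 - l) for b, l in zip(base, layer))


def over(base: RGB, color: RGB, alpha: float) -> RGB:
    """Alpha-over compositing of a solid color with coverage `alpha`."""
    return tuple(alpha * c + (1.0 - alpha) * b for b, c in zip(base, color))


def compose_pixel(flat: RGB, shadow: RGB, highlight: RGB,
                  line_alpha: float,
                  line_color: RGB = (0.0, 0.0, 0.0)) -> RGB:
    """Assumed recomposition order: flat color, then shadow, highlight, line art."""
    shaded = multiply(flat, shadow)
    lit = screen(shaded, highlight)
    return over(lit, line_color, line_alpha)


def recolor(flat_layer: List[RGB], old: RGB, new: RGB,
            tol: float = 1e-6) -> List[RGB]:
    """Swap one flat color for another; shadow, highlight, and line art are untouched."""
    def matches(px: RGB) -> bool:
        return all(abs(a - b) <= tol for a, b in zip(px, old))
    return [new if matches(px) else px for px in flat_layer]
```

Because only the flat-color layer changes under `recolor`, recomposing with the original shadow, highlight, and line-art layers keeps the shading and outlines intact, which is the kind of edit that is hard to perform on an entangled raster image.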