Recent generative image editing methods adopt layered representations to mitigate the entangled nature of raster images and improve controllability, typically relying on object-based segmentation. However, such strategies often fail to capture the structural and stylistic properties of human-created images such as anime illustrations. To address this issue, we propose a workflow-aware structured layer decomposition framework tailored to anime illustration production. Inspired by the anime production pipeline, our method decomposes an illustration into semantically meaningful production layers: line art, flat color, shadow, and highlight. To decouple these layers, we introduce lightweight layer semantic embeddings that provide task-specific guidance for each layer. Furthermore, we incorporate a set of layer-wise losses to supervise the training of the individual layers. To overcome the lack of ground-truth layered data, we construct a high-quality illustration dataset that simulates the standard anime production workflow. Experiments demonstrate that our method achieves accurate and visually coherent layer decompositions. We believe the resulting layered representation further enables downstream tasks such as recoloring and texture embedding, supporting content creation and illustration editing. Code is available at: https://github.com/zty0304/Anime-layer-decomposition