In recent years, conditional image synthesis has attracted growing attention because it makes the image generation process controllable. Although recent works achieve realistic results, most of them struggle with fine-grained styles that contain subtle details. To address this problem, we propose a novel normalization module, Detailed Region-Adaptive Normalization (DRAN), which adaptively learns both fine-grained and coarse-grained style representations. Specifically, we first introduce a multi-level structure, Spatiality-aware Pyramid Pooling, to guide the model to learn coarse-to-fine features. Then, we propose Dynamic Gating, which adaptively fuses the different levels of styles according to the spatial region. Finally, we collect a new makeup dataset, the Makeup-Complex dataset, which contains a wide range of complex makeup styles with diverse poses and expressions. To evaluate the effectiveness and generality of our method, we conduct experiments on makeup transfer and semantic image synthesis. Quantitative and qualitative results show that, equipped with DRAN, simple baseline models achieve promising improvements in complex style transfer and detailed texture synthesis. Both the code and the proposed dataset will be available at https://github.com/Yueming6568/DRAN-makeup.git.
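To make the coarse-to-fine pooling and region-wise fusion described above more concrete, the following is a minimal NumPy sketch, not the DRAN implementation. The function names `pyramid_pool` and `dynamic_gate`, the grid levels `(1, 2, 4)`, the use of average pooling, and the per-pixel softmax gate are all assumptions made for illustration; in a trained model the gate logits would presumably be predicted by learned layers rather than supplied by hand.

```python
import numpy as np


def pyramid_pool(style, levels=(1, 2, 4)):
    """Average-pool a (C, H, W) style map over g x g grids, one grid per level.

    Each pooled grid is broadcast back to (C, H, W), so small g gives a coarse
    style summary and large g keeps finer spatial detail.
    Assumes H and W are divisible by every g in `levels`.
    """
    c, h, w = style.shape
    pooled = []
    for g in levels:
        # Split H and W into g blocks each, then average inside every block.
        cells = style.reshape(c, g, h // g, g, w // g).mean(axis=(2, 4))  # (C, g, g)
        # Repeat each cell value over the block it was computed from.
        full = np.repeat(np.repeat(cells, h // g, axis=1), w // g, axis=2)
        pooled.append(full)
    return np.stack(pooled)  # (L, C, H, W)


def dynamic_gate(level_feats, gate_logits):
    """Fuse L style levels with per-pixel softmax weights.

    level_feats: (L, C, H, W) output of pyramid_pool.
    gate_logits: (L, H, W) hypothetical gate scores, one per level and pixel.
    """
    z = gate_logits - gate_logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(z) / np.exp(z).sum(axis=0, keepdims=True)  # sums to 1 over L
    return (weights[:, None] * level_feats).sum(axis=0)  # (C, H, W)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    style = rng.normal(size=(8, 16, 16))   # toy style feature map
    logits = rng.normal(size=(3, 16, 16))  # placeholder for learned gate scores
    fused = dynamic_gate(pyramid_pool(style), logits)
    print(fused.shape)
```

Because the gate is computed per spatial location, one region can draw mostly on the finest level while another falls back to the coarse summary, which is the behavior the abstract attributes to Dynamic Gating.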