Adjusting the colors of a photo to match other design elements is an essential way for a graphic design to deliver its message effectively and remain aesthetically pleasing. However, existing tools and prior work face a trade-off between ease of use and expressiveness. To this end, we introduce an interactive language-based approach to photo recoloring, an intuitive system that can assist both experts and novices in graphic design. Given a graphic design containing a photo that needs to be recolored, our model predicts the source colors and the target regions, and then recolors the target regions with the source colors according to the given language instruction. The multi-granularity of the instructions accommodates diverse user intentions. This novel task poses several unique challenges: 1) color accuracy, i.e., recoloring with exactly the same color as the target design element specified by the user; 2) multi-granularity instructions, i.e., parsing instructions correctly to generate either a specific result or multiple plausible ones; and 3) locality, i.e., recoloring semantically meaningful local regions so that the original image semantics are preserved. To address these challenges, we propose a model called LangRecol with two main components: a language-based source color prediction module and a semantic-palette-based photo recoloring module. We also introduce an approach for generating a synthetic graphic design dataset with instructions to enable model training. We evaluate our model through extensive experiments and user studies. We also discuss several practical applications that demonstrate the effectiveness and practicality of our approach. Code and data for this paper are available at: https://zhenwwang.github.io/langrecol.
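To make the locality and color-transfer ideas concrete, the following is a minimal toy sketch and not the LangRecol model. It assumes the source color and the target-region mask are already known (in the paper these are predicted from the instruction), and it swaps in a simple hue/saturation replacement in HLS space in place of the semantic-palette-based recoloring module. The function name `recolor_region` and its inputs are hypothetical.

```python
import colorsys


def recolor_region(pixels, mask, source_rgb):
    """Toy recoloring: give masked pixels the hue and saturation of source_rgb.

    Assumptions (not from the paper):
      pixels     -- list of rows of (r, g, b) tuples with 0-255 integers
      mask       -- list of rows of bools, True inside the target region
      source_rgb -- (r, g, b) color taken from a design element
    Each masked pixel keeps its own lightness so shading is preserved;
    pixels outside the mask are returned unchanged (locality).
    """
    src_h, _, src_s = colorsys.rgb_to_hls(*(c / 255.0 for c in source_rgb))
    out = []
    for row, mask_row in zip(pixels, mask):
        new_row = []
        for (r, g, b), selected in zip(row, mask_row):
            if not selected:
                # Outside the target region: leave the pixel as it is.
                new_row.append((r, g, b))
                continue
            # Keep the pixel's lightness, take hue and saturation from the source.
            _, lightness, _ = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
            nr, ng, nb = colorsys.hls_to_rgb(src_h, lightness, src_s)
            new_row.append((round(nr * 255), round(ng * 255), round(nb * 255)))
        out.append(new_row)
    return out


if __name__ == "__main__":
    photo = [[(200, 30, 30), (10, 200, 10)]]
    region = [[True, False]]
    print(recolor_region(photo, region, (0, 0, 255)))
```

Because lightness is preserved, this toy version matches the source hue but not the exact source color, which is one reason the color accuracy challenge listed in the abstract calls for a more careful method.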