Creating layouts is a fundamental step in graphic design. In this work, we propose using text as guidance for creating graphic layouts, i.e., Text-to-Layout, with the aim of lowering design barriers. Text-to-Layout is challenging because it must handle implicit, combined, and incomplete layout constraints expressed in text, none of which has been studied in previous work. To address this, we present a two-stage approach named parse-then-place. The approach introduces an intermediate representation (IR) between text and layout to represent diverse layout constraints. With the IR, Text-to-Layout is decomposed into a parse stage and a place stage. The parse stage takes a textual description as input and generates an IR in which the implicit constraints from the text are made explicit. The place stage generates layouts based on the IR. To model combined and incomplete constraints, we use a Transformer-based layout generation model and carefully design a way to represent constraints and layouts as sequences. In addition, we adopt a pretrain-then-finetune strategy to boost the performance of the layout generation model with large-scale unlabeled layouts. To evaluate our approach, we construct two Text-to-Layout datasets and conduct experiments on them. Quantitative results, qualitative analysis, and user studies demonstrate the effectiveness of our approach.
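To make the idea of representing constraints and layouts as sequences more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the IR schema or the token format, so every name here (`ElementConstraint`, `LayoutIR`, `serialize_ir`, `serialize_layout`, the `[el]` separator, and the 32-bin coordinate quantization) is a hypothetical choice made for illustration. Unspecified fields are simply omitted from the sequence, which is one plausible way to express incomplete constraints.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ElementConstraint:
    """One explicit constraint in a hypothetical IR (schema is an assumption)."""
    element_type: str                # e.g. "image", "button"
    count: Optional[int] = None      # None = not specified by the text
    position: Optional[str] = None   # e.g. "top"; None = not specified


@dataclass
class LayoutIR:
    """Hypothetical intermediate representation: a list of element constraints."""
    elements: List[ElementConstraint] = field(default_factory=list)


def serialize_ir(ir: LayoutIR) -> List[str]:
    """Flatten the IR into a token sequence, skipping unspecified fields."""
    tokens: List[str] = []
    for e in ir.elements:
        tokens.append("[el]")  # assumed separator between elements
        tokens.append(f"type:{e.element_type}")
        if e.count is not None:
            tokens.append(f"count:{e.count}")
        if e.position is not None:
            tokens.append(f"pos:{e.position}")
    return tokens


def serialize_layout(
    boxes: List[Tuple[str, float, float, float, float]], bins: int = 32
) -> List[str]:
    """Flatten (type, x, y, w, h) boxes with coordinates in [0, 1] into tokens.

    Each coordinate is quantized into one of `bins` integer bins (an assumed
    discretization; the bin count is arbitrary here).
    """
    tokens: List[str] = []
    for element_type, *coords in boxes:
        tokens.append(element_type)
        for v in coords:
            tokens.append(str(min(int(v * bins), bins - 1)))
    return tokens


# Example: "a page with an image at the top and some buttons"
ir = LayoutIR([
    ElementConstraint("image", count=1, position="top"),
    ElementConstraint("button"),  # count and position left unspecified
])
constraint_seq = serialize_ir(ir)
layout_seq = serialize_layout([("image", 0.0, 0.5, 1.0, 0.25)])
```

In a setup like this, a sequence model would be conditioned on `constraint_seq` and trained to emit `layout_seq`; pretraining on unlabeled layouts would need only the layout side of the pair.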