Conditional graphic layout generation, which generates realistic layouts according to user constraints, is a challenging task that has not yet been well studied. First, there is limited discussion of how to handle diverse user constraints flexibly and uniformly. Second, to make layouts conform to user constraints, existing work often sacrifices generation quality significantly. In this work, we propose LayoutFormer++ to tackle these problems. First, to handle diverse constraints flexibly, we propose a constraint serialization scheme, which represents different user constraints as sequences of tokens in a predefined format. We then formulate conditional layout generation as a sequence-to-sequence transformation and adopt an encoder-decoder framework with the Transformer as the basic architecture. Furthermore, to make layouts better meet user requirements without harming quality, we propose a decoding space restriction strategy. Specifically, we prune the predicted distribution by ignoring options that definitely violate user constraints and options that are likely to result in low-quality layouts, and make the model sample from the restricted distribution. Experiments demonstrate that LayoutFormer++ outperforms existing approaches on all tasks, achieving both better generation quality and fewer constraint violations.
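The two ideas in the abstract, constraint serialization and decoding space restriction, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the token format (element type, optional quantized size, `|` as separator), the function names `serialize_constraints` and `restricted_sample`, and the use of a probability threshold `min_prob` as a stand-in for "likely to result in low-quality layouts" are all assumptions made for illustration.

```python
import math
import random


def serialize_constraints(elements):
    """Turn per-element constraints into a flat token sequence.

    Hypothetical format: each element contributes its type and, if given,
    its quantized width and height; elements are separated by "|".
    """
    tokens = []
    for i, element in enumerate(elements):
        if i > 0:
            tokens.append("|")
        tokens.append(element["type"])
        if "size" in element:
            width, height = element["size"]
            tokens.extend([str(width), str(height)])
    return tokens


def restricted_sample(logits, allowed, min_prob=0.0, rng=random):
    """Sample a token index from a pruned next-token distribution.

    - Indices outside `allowed` are dropped (they would violate a constraint).
    - Indices whose softmax probability is below `min_prob` are dropped
      (assumed proxy for low-quality options), unless that would leave
      nothing to sample from.
    """
    # Numerically stable softmax over the full vocabulary.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    feasible = sorted(i for i in allowed if 0 <= i < len(logits))
    if not feasible:
        raise ValueError("no token satisfies the constraints")

    kept = [i for i in feasible if probs[i] >= min_prob] or feasible
    weights = [probs[i] for i in kept]
    if sum(weights) == 0.0:
        # All kept probabilities underflowed; fall back to uniform sampling.
        weights = [1.0] * len(kept)

    # random.choices renormalizes the weights, so this samples from the
    # restricted distribution.
    return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    constraint = [{"type": "text", "size": (4, 2)}, {"type": "image"}]
    print(serialize_constraints(constraint))

    next_token_logits = [5.0, 1.0, 0.0, -1.0]
    print(restricted_sample(next_token_logits, allowed={1, 2}))
```

In this sketch the unconstrained model would almost always pick index 0, but because index 0 is not in `allowed`, sampling is confined to indices 1 and 2. Raising `min_prob` additionally removes feasible but improbable options.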