Controllable layout generation refers to the process of creating a plausible visual arrangement of elements within a graphic design (e.g., document and web designs) subject to constraints that represent design intentions. Although recent diffusion-based models have achieved state-of-the-art FID scores, their outputs tend to show more pronounced misalignment than those of earlier transformer-based models. In this work, we propose the $\textbf{LA}$yout $\textbf{C}$onstraint diffusion mod$\textbf{E}$l (LACE), a unified model that handles a broad range of layout generation tasks, such as arranging elements with specified attributes and refining or completing a coarse layout design. LACE is built on a continuous diffusion model. Unlike existing methods that use discrete diffusion models, its continuous state space allows differentiable aesthetic constraint functions to be incorporated in training. For conditional generation, we introduce conditions via masked input. Extensive experimental results show that LACE produces high-quality layouts and outperforms existing state-of-the-art baselines.
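As a rough illustration of the two mechanisms mentioned above, the following minimal sketch shows what an aesthetic constraint on continuous box coordinates and masked-input conditioning might look like. It is not the formulation used in LACE: the box encoding `(x_center, y_center, width, height)`, the function names `alignment_penalty` and `apply_mask`, and the specific penalty are all assumptions made for illustration. The sketch is written in plain Python for readability; in training, such a penalty would be expressed in an automatic-differentiation framework so that gradients can flow to the coordinates.

```python
from typing import List, Tuple

# Hypothetical box encoding (an assumption, not taken from the paper):
# (x_center, y_center, width, height), each normalized to [0, 1].
Box = Tuple[float, float, float, float]
Mask = Tuple[int, int, int, int]


def _alignment_lines(box: Box) -> Tuple[float, ...]:
    """Return the six alignment lines of a box:
    left, x-center, right, top, y-center, bottom."""
    x, y, w, h = box
    return (x - w / 2, x, x + w / 2, y - h / 2, y, y + h / 2)


def alignment_penalty(boxes: List[Box]) -> float:
    """Illustrative alignment penalty on continuous coordinates.

    For each box, take the smallest absolute offset between one of its
    alignment lines and the corresponding line of any other box, then
    average over boxes. The value is 0 when every box shares at least one
    alignment line with another box. It is built only from abs and min,
    so it is differentiable almost everywhere in the coordinates.
    """
    if len(boxes) < 2:
        return 0.0
    total = 0.0
    for i, box_i in enumerate(boxes):
        lines_i = _alignment_lines(box_i)
        best = float("inf")
        for j, box_j in enumerate(boxes):
            if i == j:
                continue
            lines_j = _alignment_lines(box_j)
            best = min(best, min(abs(a - b) for a, b in zip(lines_i, lines_j)))
        total += best
    return total / len(boxes)


def apply_mask(noisy: List[Box], known: List[Box], mask: List[Mask]) -> List[Box]:
    """Illustrative masked-input conditioning.

    Where the mask is 1 the known (user-specified) attribute is kept;
    where it is 0 the noisy value is passed through for the model to denoise.
    """
    return [
        tuple(k if m else n for n, k, m in zip(noisy_box, known_box, mask_box))
        for noisy_box, known_box, mask_box in zip(noisy, known, mask)
    ]
```

For example, calling `apply_mask` with a mask of `(0, 0, 1, 1)` for an element fixes its width and height to the specified values while leaving its position to be generated. A penalty of this kind can only be applied directly because the coordinates are continuous, which is the advantage the abstract attributes to the continuous state space.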