In graphic design, automating the integration of design elements into a cohesive multi-layered artwork not only boosts productivity but also paves the way for the democratization of graphic design. One existing practice is Graphic Layout Generation (GLG), which aims to lay out a sequence of design elements. GLG requires the correct layer order to be specified in advance, which limits creative potential and increases user workload. In this paper, we present Hierarchical Layout Generation (HLG) as a more flexible and pragmatic setup, which creates graphic compositions from unordered sets of design elements. To tackle the HLG task, we introduce Graphist, the first layout generation model based on large multimodal models. Graphist efficiently reframes HLG as a sequence generation problem: it takes RGB-A images as input and outputs a JSON draft protocol that specifies the coordinates, size, and order of each element. We also develop new evaluation metrics for HLG. Graphist outperforms prior art and establishes a strong baseline for this field. Project homepage: https://github.com/graphic-design-ai/graphist
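To make the output format more concrete, the following is a minimal sketch of how a JSON draft describing the coordinates, size, and order of each element might be consumed downstream. The field names (`element`, `x`, `y`, `width`, `height`, `layer`) and the sample values are illustrative assumptions, not the schema actually emitted by Graphist.

```python
import json

# Hypothetical draft: field names and values are illustrative assumptions,
# not the actual Graphist protocol. Elements arrive in arbitrary order.
draft_json = """
[
  {"element": "title.png", "x": 120, "y": 40, "width": 400, "height": 80, "layer": 2},
  {"element": "background.png", "x": 0, "y": 0, "width": 640, "height": 480, "layer": 0},
  {"element": "logo.png", "x": 500, "y": 380, "width": 100, "height": 60, "layer": 1}
]
"""


def parse_draft(text):
    """Parse the draft and return elements sorted bottom-to-top by layer index."""
    elements = json.loads(text)
    return sorted(elements, key=lambda e: e["layer"])


def bounding_box(element):
    """Convert position and size into (left, top, right, bottom)."""
    x, y = element["x"], element["y"]
    return (x, y, x + element["width"], y + element["height"])


ordered = parse_draft(draft_json)
stacking = [e["element"] for e in ordered]   # drawing order, bottom layer first
boxes = [bounding_box(e) for e in ordered]   # one box per element, same order
```

Sorting by the predicted layer index recovers a drawing order from an unordered input set, which is the part of the task that distinguishes HLG from GLG.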