LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer - 专知论文

会员服务 ·

0

多峰值 · 设计 · 变换 · Automator · MoDELS ·

2023 年 1 月 30 日

LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer

翻译：LayoutDETR：检测Transformer是一种优秀的多模态布局设计师

Ning Yu,Chia-Chih Chen,Zeyuan Chen,Rui Meng,Gang Wu,Paul Josel,Juan Carlos Niebles,Caiming Xiong,Ran Xu

Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs are skill-demanding, time-consuming, and non-scalable to batch production. Although generative models emerge to make design automation no longer utopian, it remains non-trivial to customize designs that comply with designers' multimodal desires, i.e., constrained by background images and driven by foreground contents. In this study, we propose \textit{LayoutDETR} that inherits the high quality and realism from generative modeling, in the meanwhile reformulating content-aware requirements as a detection problem: we learn to detect in a background image the reasonable locations, scales, and spatial relations for multimodal elements in a layout. Experiments validate that our solution yields new state-of-the-art performance for layout generation on public benchmarks and on our newly-curated ads banner dataset. For practical usage, we build our solution into a graphical system that facilitates user studies. We demonstrate that our designs attract more subjective preferences than baselines by significant margins. Our code, models, dataset, graphical system, and demos are available at https://github.com/salesforce/LayoutDETR.

翻译：图形布局设计在视觉传达中扮演着至关重要的角色。然而，手工制作布局设计既需要专业技能，又耗费时间，且难以规模化批量生产。尽管生成式模型的出现使设计自动化不再遥不可及，但定制符合设计师多模态需求（即受背景图像约束并由前景内容驱动）的设计仍然具有挑战性。在本研究中，我们提出\textit{LayoutDETR}，它继承了生成式模型的高质量与逼真性，同时将内容感知需求重新定义为检测问题：我们学习在背景图像中检测布局中多模态元素的合理位置、尺度及空间关系。实验证明，我们的方法在公开基准测试以及我们新整理的海报横幅数据集上，均取得了布局生成领域的最新最优性能。为便于实际应用，我们将该方案构建为一个图形系统，支持用户研究。我们证明，与基线方法相比，我们设计的方案在主观偏好上获得了显著性优势。我们的代码、模型、数据集、图形系统及演示程序均可通过https://github.com/salesforce/LayoutDETR获取。

0

相关内容

多峰值

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

动脉粥样硬化中oxLDL/CD36受体介导VSMC炎症表型转化的作用和机制

国家自然科学基金

0+阅读 · 2014年12月31日

面向DVS-MVI的多核SoC测试结构设计与测试调度算法协同优化方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

miR-140促进CYP2J2基因表达对动脉粥样硬化中血管炎症的调控作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

TRAF1在心肌梗死后心室重构中的作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Doublecortin的动态表达在骨折愈合中的作用与调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

多功能MR分子探针携带rt-PA靶向溶栓治疗的基础研究

国家自然科学基金

0+阅读 · 2011年12月31日

离子通道TRPM2在血管壁内膜增生中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

有机铁电材料的纳米压印加工及应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

数字图像被动盲取证方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

OcTr: Octree-based Transformer for 3D Object Detection

Arxiv

0+阅读 · 2023年3月22日

PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision

Arxiv

0+阅读 · 2023年3月21日

The Multiscale Surface Vision Transformer

The Multiscale Surface Vision Transformer

Arxiv

0+阅读 · 2023年3月21日

Latent Video Diffusion Models for High-Fidelity Long Video Generation

Arxiv

0+阅读 · 2023年3月20日

FullFormer: Generating Shapes Inside Shapes

Arxiv

0+阅读 · 2023年3月20日

EDTER: Edge Detection with Transformer

Arxiv

11+阅读 · 2022年3月16日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Arxiv

11+阅读 · 2018年7月16日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

VIP会员

文章信息

相关主题

最新内容

博士论文 | 面向大模型推理的内存高效算法

博士论文 | 面向大模型推理的内存高效算法

专知会员服务

2+阅读 · 7月27日

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

专知会员服务

3+阅读 · 7月27日

《无人系统互操作性导论——无人系统联合架构（JAUS）》

《无人系统互操作性导论——无人系统联合架构（JAUS）》

专知会员服务

9+阅读 · 7月27日

美空军新型反无人机部队初探

美空军新型反无人机部队初探

专知会员服务

5+阅读 · 7月27日

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

《对抗性电磁环境下远程巡飞弹作战的安全指挥与控制数据链》

专知会员服务

4+阅读 · 7月27日

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

《北约下一代建模与仿真（NexGen M&S）计划》2026年69页

专知会员服务

3+阅读 · 7月27日

《防空交战流程的概率建模研究》

《防空交战流程的概率建模研究》

专知会员服务

7+阅读 · 7月27日

ICML 2026 教程 | 数值优化理论还重要吗？

ICML 2026 教程 | 数值优化理论还重要吗？

专知会员服务

6+阅读 · 7月26日

ICM 2026 | 陶哲轩：人工智能时代的数学

ICM 2026 | 陶哲轩：人工智能时代的数学

专知会员服务

9+阅读 · 7月26日

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

《面向可扩展高韧性无人机集群网络的速度感知分层通信框架》

专知会员服务

8+阅读 · 7月26日

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

《面向概率推理的可定制战术引擎及其在军事任务规划中的应用》

专知会员服务

11+阅读 · 7月26日

《先进防空系统选型战略框架：基于巴基斯坦的实证启示》

《先进防空系统选型战略框架：基于巴基斯坦的实证启示》

专知会员服务

8+阅读 · 7月26日

《反无人机交战场景下的战斗归零研究》

《反无人机交战场景下的战斗归零研究》

专知会员服务

7+阅读 · 7月26日

霍尔木兹与不对称作战时代：水雷、无人系统与海军力量的重新定义

霍尔木兹与不对称作战时代：水雷、无人系统与海军力量的重新定义

专知会员服务

4+阅读 · 7月26日

博士论文 | 用代码结构感知方法推进代码大模型

博士论文 | 用代码结构感知方法推进代码大模型

专知会员服务

6+阅读 · 7月25日

相关VIP内容

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

论文解读 | 从预训练到后训练：理解大模型推理能力如何形成

美空军新型反无人机部队初探

博士论文 | 面向大模型推理的内存高效算法

《无人系统互操作性导论——无人系统联合架构（JAUS）》

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

征稿 | CFP：Special Issue of NLP and KG(JCR Q2，IF2.67)

开放知识图谱

1+阅读 · 2022年4月4日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

BERT/Transformer/迁移学习NLP资源大列表

BERT/Transformer/迁移学习NLP资源大列表

专知

19+阅读 · 2019年6月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

相关论文

OcTr: Octree-based Transformer for 3D Object Detection

Arxiv

0+阅读 · 2023年3月22日

PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision

Arxiv

0+阅读 · 2023年3月21日

The Multiscale Surface Vision Transformer

The Multiscale Surface Vision Transformer

Arxiv

0+阅读 · 2023年3月21日

Latent Video Diffusion Models for High-Fidelity Long Video Generation

Arxiv

0+阅读 · 2023年3月20日

FullFormer: Generating Shapes Inside Shapes

Arxiv

0+阅读 · 2023年3月20日

EDTER: Edge Detection with Transformer

Arxiv

11+阅读 · 2022年3月16日

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

Arxiv

19+阅读 · 2020年11月18日

Prime Sample Attention in Object Detection

Arxiv

13+阅读 · 2019年4月9日

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Convolutional Neural Networks for Aerial Multi-Label Pedestrian Detection

Arxiv

11+阅读 · 2018年7月16日

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Arxiv

10+阅读 · 2018年3月8日

相关基金

动脉粥样硬化中oxLDL/CD36受体介导VSMC炎症表型转化的作用和机制

国家自然科学基金

0+阅读 · 2014年12月31日

面向DVS-MVI的多核SoC测试结构设计与测试调度算法协同优化方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

miR-140促进CYP2J2基因表达对动脉粥样硬化中血管炎症的调控作用及机制研究

国家自然科学基金

0+阅读 · 2013年12月31日

TRAF1在心肌梗死后心室重构中的作用及分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

Doublecortin的动态表达在骨折愈合中的作用与调控机制

国家自然科学基金

0+阅读 · 2012年12月31日

多功能MR分子探针携带rt-PA靶向溶栓治疗的基础研究

国家自然科学基金

0+阅读 · 2011年12月31日

离子通道TRPM2在血管壁内膜增生中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

有机铁电材料的纳米压印加工及应用研究

国家自然科学基金

0+阅读 · 2011年12月31日

数字图像被动盲取证方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

TR3相互作用新蛋白机理研究

国家自然科学基金

1+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员