Large-scale Text-to-Image Generation Models for Visual Artists' Creative Works

Large-scale Text-to-image Generation Models (LTGMs) (e.g., DALL-E), self-supervised deep learning models trained on a huge dataset, have demonstrated the capacity for generating high-quality open-domain images from multi-modal input. Although they can even produce anthropomorphized versions of objects and animals, combine irrelevant concepts in reasonable ways, and give variation to any user-provided images, we witnessed such rapid technological advancement left many visual artists disoriented in leveraging LTGMs more actively in their creative works. Our goal in this work is to understand how visual artists would adopt LTGMs to support their creative works. To this end, we conducted an interview study as well as a systematic literature review of 72 system/application papers for a thorough examination. A total of 28 visual artists covering 35 distinct visual art domains acknowledged LTGMs' versatile roles with high usability to support creative works in automating the creation process (i.e., automation), expanding their ideas (i.e., exploration), and facilitating or arbitrating in communication (i.e., mediation). We conclude by providing four design guidelines that future researchers can refer to in making intelligent user interfaces using LTGMs.

翻译：大规模文本到图像生成模型（LTGMs）（如DALL-E）是在大规模数据集上训练的自监督深度学习模型，已展现出从多模态输入生成高质量开放域图像的能力。尽管这些模型能够生成拟人化的物体和动物形象，以合理方式结合无关概念，并对用户提供的任意图像进行变体创作，但我们观察到，这种技术快速发展使得许多视觉艺术家在更积极地利用LTGMs支持其创意作品时感到无所适从。本研究旨在理解视觉艺术家如何采用LTGMs辅助其创意工作。为此，我们通过访谈研究及对72篇系统/应用论文的系统文献综述进行了深入考察。覆盖35个视觉艺术领域的28位艺术家一致认为，LTGMs具有多功能角色和高可用性，可通过自动化创作过程（即自动化）、拓展创意构思（即探索），以及促进或调节沟通交流（即中介）来支持创意工作。最后，我们提出了四项设计指南，供未来研究者参考以构建基于LTGMs的智能用户界面。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

164+阅读 · 2019年10月12日