The role of interface design on prompt-mediated creativity in Generative AI

Generative AI for the creation of images is becoming a staple in the toolkit of digital artists and visual designers. The interaction with these systems is mediated by prompting, a process in which users write a short text to describe the desired image's content and style. The study of prompts offers an unprecedented opportunity to gain insight into the process of human creativity. Yet, our understanding of how people use them remains limited. We analyze more than 145,000 prompts from the logs of two Generative AI platforms (Stable Diffusion and Pick-a-Pic) to shed light on how people explore new concepts over time, and how their exploration might be influenced by different design choices in human-computer interfaces to Generative AI. We find that users exhibit a tendency towards exploration of new topics over exploitation of concepts visited previously. However, a comparative analysis of the two platforms, which differ both in scope and functionalities, reveals some stark differences. Features diverting user focus from prompting and providing instead shortcuts for quickly generating image variants are associated with a considerable reduction in both exploration of novel concepts and detail in the submitted prompts. These results carry direct implications for the design of human interfaces to Generative AI and raise new questions regarding how the process of prompting should be aided in ways that best support creativity.

