Crafting a rich and unique environment is crucial for fictional world-building, but can be difficult to achieve since illustrating a world from scratch requires time and significant skill. We investigate the use of recent multi-modal image generation systems to enable users to iteratively visualize and modify elements of their fictional world using a combination of text input, sketching, and region-based filling. WorldSmith enables novice world builders to quickly visualize a fictional world with layered edits and hierarchical compositions. Through a formative study (4 participants) and a first-use study (13 participants), we demonstrate that WorldSmith offers more expressive interactions with prompt-based models. With this work, we explore how creatives can be empowered to leverage prompt-based generative AI as a tool in their creative process, beyond current "click-once" prompting UI paradigms.