Humans think visually: we remember in images, dream in pictures, and use visual metaphors to communicate. Yet most creative writing tools remain text-centric, limiting how authors plan and translate ideas. We present Vistoria, a system for synchronized text-image co-editing in fictional story writing that treats visuals and text as coequal narrative materials. A formative Wizard-of-Oz co-design study with 10 story writers revealed how sketches, images, and annotations serve as essential instruments for ideation and organization. Drawing on theories of Instrumental Interaction and Structural Mapping, Vistoria introduces multimodal operations (lasso, collage, filters, and perspective shifts) that enable seamless narrative exploration across modalities. A controlled study with 12 participants shows that co-editing enhances expressiveness, immersion, and collaboration, enabling writers to explore divergent directions, embrace serendipitous randomness, and trace evolving storylines. While multimodality increased cognitive demand, participants reported stronger senses of authorship and agency. These findings demonstrate how multimodal co-editing expands creative potential by balancing abstraction and concreteness in narrative development.