We characterize and demonstrate how the principles of direct manipulation can improve interaction with large language models (LLMs). These principles include: continuous representation of generated objects of interest; reuse of prompt syntax in a toolbar of commands; manipulable outputs to compose or control the effect of prompts; and undo mechanisms. These ideas are exemplified in DirectGPT, a user interface layer on top of ChatGPT that works by transforming direct manipulation actions into engineered prompts. A study shows that, compared to baseline ChatGPT, participants were 50% faster and relied on 50% fewer and 72% shorter prompts to edit text, code, and vector images. Our work contributes a validated approach to integrating LLMs into traditional software using direct manipulation.
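To make the idea of transforming direct manipulation actions into engineered prompts concrete, the following is a minimal sketch and not DirectGPT's actual implementation. The names `Action`, `to_prompt`, and `History`, as well as the prompt wording, are assumptions introduced purely for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Hypothetical direct manipulation action: a toolbar command applied to selected objects."""
    command: str        # reusable prompt fragment, e.g. "make more formal"
    targets: list[str]  # text of the objects the user selected


def to_prompt(action: Action, document: str) -> str:
    """Build an engineered prompt from an action (the wording is illustrative only)."""
    selection = "\n".join(f"- {t}" for t in action.targets)
    return (
        "Here is the current document:\n"
        f"{document}\n\n"
        "Apply the following instruction only to these parts:\n"
        f"{selection}\n\n"
        f"Instruction: {action.command}\n"
        "Return the full updated document and change nothing else."
    )


class History:
    """Hypothetical undo mechanism: keeps earlier document states."""

    def __init__(self, initial: str) -> None:
        self.states = [initial]

    def commit(self, document: str) -> None:
        # Record the document returned by the model after an action.
        self.states.append(document)

    def undo(self) -> str:
        # Drop the latest state unless it is the initial one, then return the current state.
        if len(self.states) > 1:
            self.states.pop()
        return self.states[-1]
```

In this sketch, a selection plus a toolbar click becomes an `Action`, `to_prompt` produces the text that would be sent to the model, and each model response is stored with `History.commit` so that `undo` can restore the previous state.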