We present VOICE, a novel approach to science communication that connects the conversational capabilities of large language models (LLMs) with interactive exploratory visualization. VOICE introduces several technical contributions that drive our conversational visualization framework. Its foundation is a pack-of-bots, in which each bot performs a specific role such as assigning tasks, extracting instructions, or generating coherent content. We employ fine-tuning and prompt engineering to tailor each bot's performance to its role so that it responds accurately to user queries. Our interactive text-to-visualization method generates a flythrough sequence that matches the content explanation. In addition, natural language interaction lets users navigate and manipulate the 3D models in real time. The VOICE framework accepts arbitrary voice commands from the user and responds verbally, with the spoken response tightly coupled to the corresponding visual representation, at low latency and with high accuracy. We demonstrate the effectiveness of our approach by applying it to the molecular visualization domain, analyzing three 3D molecular models with multi-scale and multi-instance attributes. Finally, we evaluate VOICE with education experts to show the potential of our approach. All supplemental materials are available at https://osf.io/g7fbr.
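The pack-of-bots idea described above can be illustrated with a minimal sketch. This is not the VOICE implementation: the names `manager_bot`, `instruction_bot`, `content_bot`, `route`, and `COMMAND_VERBS` are hypothetical, and simple keyword rules and string templates stand in for the fine-tuned and prompted LLMs. It only shows the routing pattern, in which one bot assigns the task and role-specific bots handle it.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical verbs that mark a query as a navigation/manipulation command.
COMMAND_VERBS = {"zoom", "rotate", "focus", "hide", "show"}


@dataclass
class BotReply:
    role: str  # which bot produced the reply
    text: str  # payload: a command string or an explanation


def manager_bot(query: str) -> str:
    """Assign the query to a role. A keyword rule stands in for an LLM call."""
    words = query.lower().split()
    first_word = words[0] if words else ""
    return "instruction" if first_word in COMMAND_VERBS else "content"


def instruction_bot(query: str) -> BotReply:
    """Extract a structured command of the form 'action:target'."""
    action, _, target = query.strip().partition(" ")
    return BotReply("instruction", f"{action.lower()}:{target.rstrip('.!?')}")


def content_bot(query: str) -> BotReply:
    """Return explanatory text. A fixed template stands in for LLM generation."""
    return BotReply("content", f"[explanation for: {query.strip()}]")


# Registry mapping each role name to the bot that handles it.
BOTS: Dict[str, Callable[[str], BotReply]] = {
    "instruction": instruction_bot,
    "content": content_bot,
}


def route(query: str) -> BotReply:
    """Send the query to the bot chosen by the manager."""
    return BOTS[manager_bot(query)](query)


if __name__ == "__main__":
    print(route("Rotate the capsid"))
    print(route("What is a capsid?"))
```

In this sketch, a command such as "Rotate the capsid" is routed to the instruction bot and reduced to a structured string that a renderer could act on, while a question is routed to the content bot. In a real system each stand-in function would wrap a model call.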