Scientific visualization pipelines encode domain-specific procedural knowledge with strict execution dependencies, making their construction sensitive to missing stages, incorrect operator usage, and improper ordering. As a result, generating executable scientific visualization pipelines from natural-language descriptions remains challenging for large language models (LLMs), particularly in web-based environments where visualization authoring relies on explicit code-level pipeline assembly. In this work, we investigate the reliability of LLM-based scientific visualization pipeline generation, focusing on vtk.js as a representative web-based visualization library. We propose a structure-aware retrieval-augmented generation workflow that provides pipeline-aligned vtk.js code examples as contextual guidance, supporting correct module selection, parameter configuration, and execution order. We evaluate the proposed workflow across several multi-stage scientific visualization tasks and LLMs, measuring reliability in terms of pipeline executability and human correction effort. To this end, we introduce correction cost as a metric for the amount of manual intervention required to obtain a valid pipeline. Our results show that structured, domain-specific context substantially improves pipeline executability and reduces correction cost. We additionally provide an interactive analysis interface to support human-in-the-loop inspection and systematic evaluation of generated visualization pipelines.