Recent agentic systems demonstrate that large language models can generate scientific visualizations from natural language. However, reliability remains a major limitation: systems may execute invalid operations, introduce subtle but consequential errors, or fail to request missing information when inputs are underspecified. These issues are amplified in real-world workflows, which often exceed the complexity of standard benchmarks. Ensuring reliability in autonomous visualization pipelines therefore remains an open challenge. We present TopoPilot, a reliable and extensible agentic framework for automating complex scientific visualization workflows. TopoPilot incorporates systematic guardrails and verification mechanisms to ensure reliable operation. While we focus on topological data analysis and visualization as a primary use case, the framework is designed to generalize across visualization domains. TopoPilot adopts a reliability-centered two-agent architecture. An orchestrator agent translates user prompts into workflows composed of atomic backend actions, while a verifier agent evaluates these workflows prior to execution, enforcing structural validity and semantic consistency. This separation of interpretation and verification reduces code-generation errors and enforces correctness guarantees. A modular architecture further improves robustness by isolating components and enabling seamless integration of new descriptors and domain-specific workflows without modifying the core system. To systematically address reliability, we introduce a taxonomy of failure modes and implement targeted safeguards for each class. In evaluations simulating 1,000 multi-turn conversations across 100 prompts, including adversarial and infeasible requests, TopoPilot achieves a success rate exceeding 99%, compared to under 50% for baselines without comprehensive guardrails and checks.
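To make the verify-before-execute idea concrete, the following is a minimal sketch of a structural check over a workflow of atomic actions. It is an illustration under stated assumptions, not TopoPilot's implementation: the names `ActionSpec`, `Step`, `REGISTRY`, and `verify`, and the three example actions, are all hypothetical. The sketch covers only structural validity (known actions, required parameters, inputs produced by earlier steps); the semantic-consistency checks described above are out of scope here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ActionSpec:
    """Declares what a (hypothetical) atomic backend action needs and yields."""
    required_params: frozenset  # parameter names the user or orchestrator must supply
    consumes: frozenset         # artifacts that must exist before the step runs
    produces: frozenset         # artifacts that exist after the step runs


# Hypothetical registry; these action names are illustrative assumptions.
REGISTRY = {
    "load_scalar_field": ActionSpec(
        frozenset({"path"}), frozenset(), frozenset({"field"})
    ),
    "compute_persistence_diagram": ActionSpec(
        frozenset(), frozenset({"field"}), frozenset({"diagram"})
    ),
    "render_diagram": ActionSpec(
        frozenset(), frozenset({"diagram"}), frozenset({"image"})
    ),
}


@dataclass
class Step:
    """One workflow step: an action name plus its parameters."""
    action: str
    params: dict = field(default_factory=dict)


def verify(workflow, registry=REGISTRY):
    """Check a workflow before execution.

    Returns a list of human-readable problems; an empty list means the
    workflow passed these structural checks.
    """
    problems = []
    available = set()  # artifacts produced by the steps seen so far
    for index, step in enumerate(workflow):
        spec = registry.get(step.action)
        if spec is None:
            problems.append(f"step {index}: unknown action {step.action!r}")
            continue
        missing_params = sorted(spec.required_params - set(step.params))
        if missing_params:
            problems.append(f"step {index}: missing parameters {missing_params}")
        missing_inputs = sorted(spec.consumes - available)
        if missing_inputs:
            problems.append(f"step {index}: inputs not yet produced {missing_inputs}")
        available |= spec.produces
    return problems


# A well-formed workflow and a deliberately broken one.
good = [
    Step("load_scalar_field", {"path": "data.vti"}),
    Step("compute_persistence_diagram"),
    Step("render_diagram"),
]
bad = [
    Step("load_scalar_field"),   # underspecified: no "path"
    Step("render_diagram"),      # needs a diagram that was never computed
    Step("smooth_mesh"),         # not a registered action
]
```

In this sketch, `verify(good)` returns an empty list, while `verify(bad)` reports one problem per step. A missing-parameter report is the kind of signal an orchestrator could turn into a clarification question to the user rather than guessing a value.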