Scientific AI agents can autonomously carry out complex research workflows, yet the resulting workflows often remain difficult for humans to inspect and review, which limits interpretable, controllable, and effective human-AI collaboration. To address this challenge, we present a monitoring and visualization framework that records fine-grained execution events and organizes them into a directed graph, making agent workflows explicit as they proceed. The system records intermediate steps (e.g., tool calls and code executions) and renders them as visual traces, updated in real time, that expose workflow structure. This allows users to examine how results are produced, identify where failures emerge, and better understand agent behavior across different stages of the research process. We evaluate the framework on complex research tasks with domain experts from AI, neuroscience, and biology. The experts report that structured trace visualization improves their understanding of agent workflows, perceived interpretability, and usability for analysis and further interaction.
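To make the idea of recording execution events into a directed graph concrete, the following is a minimal sketch, not the framework's actual implementation. The names `TraceEvent`, `TraceGraph`, `record`, `failures`, `path_to`, and `to_dot` are hypothetical, as are the event kinds and the simplification that each event has at most one parent. It shows how intermediate steps could be logged as nodes, linked by "led to" edges, queried for failures, and exported as Graphviz DOT text for rendering.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import time


@dataclass
class TraceEvent:
    """One recorded execution step (hypothetical schema, not the paper's)."""
    event_id: int
    kind: str              # e.g. "tool_call" or "code_execution"
    label: str
    status: str = "ok"     # "ok" or "error"
    timestamp: float = field(default_factory=time.time)


class TraceGraph:
    """Directed graph of events; an edge parent -> child means 'led to'.

    Assumption: each event has at most one parent, which keeps the sketch short.
    """

    def __init__(self) -> None:
        self.nodes: Dict[int, TraceEvent] = {}
        self.edges: Dict[int, List[int]] = {}
        self.parent: Dict[int, Optional[int]] = {}

    def record(self, kind: str, label: str,
               parent: Optional[int] = None, status: str = "ok") -> int:
        """Append an event as a new node and link it to its parent, if any."""
        if parent is not None and parent not in self.nodes:
            raise KeyError(f"unknown parent event: {parent}")
        event_id = len(self.nodes)
        self.nodes[event_id] = TraceEvent(event_id, kind, label, status)
        self.edges[event_id] = []
        self.parent[event_id] = parent
        if parent is not None:
            self.edges[parent].append(event_id)
        return event_id

    def failures(self) -> List[int]:
        """Return the ids of all events whose status is 'error'."""
        return [i for i, e in self.nodes.items() if e.status == "error"]

    def path_to(self, event_id: int) -> List[int]:
        """Return the chain of event ids from the root down to event_id."""
        path: List[int] = []
        current: Optional[int] = event_id
        while current is not None:
            path.append(current)
            current = self.parent[current]
        return path[::-1]

    def to_dot(self) -> str:
        """Serialize the graph as Graphviz DOT text; failed steps are red."""
        lines = ["digraph trace {"]
        for event in self.nodes.values():
            label = f"{event.kind}: {event.label}".replace('"', '\\"')
            color = "red" if event.status == "error" else "black"
            lines.append(f'  {event.event_id} [label="{label}", color={color}];')
        for src, targets in self.edges.items():
            for dst in targets:
                lines.append(f"  {src} -> {dst};")
        lines.append("}")
        return "\n".join(lines)


# Usage: a three-step run in which the last step fails.
graph = TraceGraph()
plan = graph.record("plan", "analyze dataset")
load = graph.record("tool_call", "load_data", parent=plan)
fit = graph.record("code_execution", "fit_model", parent=load, status="error")

print(graph.failures())    # ids of failed steps
print(graph.path_to(fit))  # steps that led to the failure
print(graph.to_dot())      # DOT text that Graphviz can render
```

Because events are appended as they occur, a viewer could re-render the DOT output after each `record` call to approximate a trace that updates in real time; a real system would more likely stream incremental updates to the interface.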