Figures as Interfaces: Toward LLM-Native Artifacts for Scientific Discovery

Large language models (LLMs) are transforming scientific workflows, not only through their generative capabilities but also through their emerging ability to use tools, reason about data, and coordinate complex analytical tasks. Yet in most human-AI collaborations, the primary outputs, namely figures, are still treated as static visual summaries: once rendered, they are handled by both humans and multimodal LLMs as images to be re-interpreted from pixels or captions. These emerging capabilities open an opportunity to fundamentally rethink this paradigm. In this paper, we introduce the concept of LLM-native figures: data-driven artifacts that are simultaneously human-legible and machine-addressable. Unlike traditional plots, each artifact embeds its complete provenance: the data subset, the analytical operations and code, and the visualization specification used to generate it. As a result, an LLM can "see through" the figure, tracing selections back to their sources, generating code to extend analyses, and orchestrating new visualizations through natural-language instructions or direct manipulation. We implement this concept through a hybrid language-visual interface that integrates LLM agents with a bidirectional mapping between figures and their underlying data. Using the science of science domain as a testbed, we demonstrate that LLM-native figures can accelerate discovery, improve reproducibility, and make reasoning transparent across agents and users. More broadly, this work establishes a general framework for embedding provenance, interactivity, and explainability into the artifacts of modern research, redefining the figure not as an end product but as an interface for discovery. For more details, please refer to the demo video available at www.llm-native-figure.com.
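To make the idea of a figure that carries its own provenance more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the names `FigureArtifact`, `Mark`, `rows_for_mark`, `marks_for_row`, and `build_papers_per_year`, as well as the layout of the `spec` dictionary, are hypothetical and chosen only for illustration. The sketch assumes that an artifact bundles the data subset, the analysis code, and the visualization specification, and that each visual mark records which source rows it was drawn from, so lookups work in both directions.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Mark:
    """One visual element (e.g. a bar) and the ids of the rows it summarizes."""
    mark_id: str
    row_ids: list[int]


@dataclass
class FigureArtifact:
    """Hypothetical container: a figure plus what is needed to regenerate it."""
    data: dict[int, dict]   # row id -> record (the data subset behind the figure)
    code: str               # the analysis step that produced the marks, as text
    spec: dict              # visualization specification (assumed layout)
    marks: list[Mark] = field(default_factory=list)

    def rows_for_mark(self, mark_id: str) -> list[dict]:
        """Figure -> data: return the source records behind a selected mark."""
        for mark in self.marks:
            if mark.mark_id == mark_id:
                return [self.data[row_id] for row_id in mark.row_ids]
        raise KeyError(mark_id)

    def marks_for_row(self, row_id: int) -> list[str]:
        """Data -> figure: return the ids of marks that include a given row."""
        return [m.mark_id for m in self.marks if row_id in m.row_ids]

    def to_json(self) -> str:
        """Serialize the whole artifact so an agent can read it as text."""
        return json.dumps(asdict(self))


def build_papers_per_year(records: list[dict]) -> FigureArtifact:
    """Group records by year and keep the row ids behind every bar."""
    data = dict(enumerate(records))
    groups: dict[int, list[int]] = {}
    for row_id, record in data.items():
        groups.setdefault(record["year"], []).append(row_id)
    marks = [Mark(f"bar-{year}", ids) for year, ids in sorted(groups.items())]
    spec = {"mark": "bar", "x": "year", "y": "count"}
    return FigureArtifact(data, "group records by year and count", spec, marks)


if __name__ == "__main__":
    papers = [
        {"title": "A", "year": 2020},
        {"title": "B", "year": 2021},
        {"title": "C", "year": 2020},
    ]
    fig = build_papers_per_year(papers)
    print(fig.rows_for_mark("bar-2020"))  # records behind the 2020 bar
    print(fig.marks_for_row(1))           # marks that contain row 1
```

In this sketch, selecting a bar resolves directly to the records it summarizes instead of being inferred from pixels, and the JSON form gives an agent the data, code, and specification in one addressable object. A real system would likely use an established visualization grammar and a richer provenance model than this toy structure.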
