Scientific data visualization plays a crucial role in research by enabling the direct display of complex information and helping researchers identify implicit patterns. Despite its importance, the use of Large Language Models (LLMs) for scientific data visualization remains largely unexplored. In this study, we introduce MatPlotAgent, an efficient, model-agnostic LLM agent framework designed to automate scientific data visualization tasks. Leveraging the capabilities of both code LLMs and multi-modal LLMs, MatPlotAgent consists of three core modules: query understanding, code generation with iterative debugging, and a visual feedback mechanism for error correction. To address the lack of benchmarks in this field, we present MatPlotBench, a high-quality benchmark consisting of 100 human-verified test cases. Additionally, we introduce a scoring approach that uses GPT-4V for automatic evaluation. Experimental results demonstrate that MatPlotAgent improves the performance of various LLMs, including both commercial and open-source models. Furthermore, the proposed evaluation method correlates strongly with human-annotated scores.
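The three modules named above (query understanding, code generation with iterative debugging, and visual feedback) can be illustrated with a minimal sketch. This is not the MatPlotAgent implementation: the function names, prompts, retry limits, the `output.png` file name, and the convention that the multi-modal model replies "OK" when the figure is acceptable are all assumptions made for illustration. The two LLMs are passed in as plain callables so that the sketch stays model-agnostic.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Optional

# Hypothetical stand-ins for model calls (not part of any real API):
TextLLM = Callable[[str], str]          # prompt -> text, e.g. a code LLM
VisionLLM = Callable[[str, Path], str]  # prompt + image path -> feedback text


def run_code(code: str, workdir: Path) -> Optional[str]:
    """Run generated code in a subprocess; return None on success, else the error text."""
    script = workdir / "plot.py"
    script.write_text(code, encoding="utf-8")
    try:
        result = subprocess.run(
            [sys.executable, str(script)],
            cwd=workdir, capture_output=True, text=True, timeout=60,
        )
    except subprocess.TimeoutExpired:
        return "execution timed out"
    return None if result.returncode == 0 else result.stderr


def debug_until_runs(code: str, code_llm: TextLLM, workdir: Path, max_debug: int) -> str:
    """Iterative debugging: execute, and on failure ask the code LLM for a fix."""
    for attempt in range(max_debug + 1):
        error = run_code(code, workdir)
        if error is None or attempt == max_debug:
            break
        code = code_llm(f"Fix this code.\nCode:\n{code}\nError:\n{error}")
    return code


def plot_agent(query: str, code_llm: TextLLM, vision_llm: VisionLLM, workdir: Path,
               max_debug: int = 3, max_feedback: int = 2,
               image_name: str = "output.png") -> str:
    """Return plotting code refined by debugging and visual feedback."""
    # 1. Query understanding: expand the request into explicit instructions.
    plan = code_llm(f"Expand this plotting request into detailed steps:\n{query}")
    # 2. Code generation, followed by iterative debugging.
    code = code_llm(f"Write Python plotting code that saves {image_name}.\n{plan}")
    for round_index in range(max_feedback + 1):
        code = debug_until_runs(code, code_llm, workdir, max_debug)
        image = workdir / image_name
        if round_index == max_feedback or not image.exists():
            break
        # 3. Visual feedback: a multi-modal model inspects the rendered figure.
        feedback = vision_llm(
            f"Reply OK if the figure satisfies this request, else list the problems:\n{query}",
            image,
        )
        if feedback.strip().upper() == "OK":  # assumed sentinel, for illustration only
            break
        code = code_llm(f"Revise the code.\nCode:\n{code}\nFeedback:\n{feedback}")
    return code
```

In this sketch the generated code runs in a separate interpreter process, so a crash in the generated script surfaces as captured error text rather than terminating the agent itself. In practice, `code_llm` and `vision_llm` would wrap whichever code and multi-modal models are being evaluated.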