Large Language Model (LLM)-based agents have shown effectiveness across many applications. However, their use in data science scenarios that require solving long-term interconnected tasks, handling dynamic data adjustments, and applying domain expertise remains challenging. Previous approaches primarily focus on individual tasks, making it difficult to assess the complete data science workflow. Moreover, they struggle to handle real-time changes in intermediate data and to adapt dynamically to the evolving task dependencies inherent in data science problems. In this paper, we present Data Interpreter, an LLM-based agent designed to automatically solve various data science problems end-to-end. Data Interpreter incorporates two key modules: 1) Hierarchical Graph Modeling, which breaks down complex problems into manageable subproblems, enabling dynamic node generation and graph optimization; and 2) Programmable Node Generation, a technique that refines and verifies each subproblem to iteratively improve code generation results and robustness. Extensive experiments consistently demonstrate the superiority of Data Interpreter. On InfiAgent-DABench, it achieves a 25% relative performance boost, raising accuracy from 75.9% to 94.9%. On machine learning and open-ended tasks, it improves performance from 88% to 95% and from 60% to 97%, respectively. Moreover, on the MATH dataset, Data Interpreter achieves a 26% improvement over state-of-the-art baselines. The code is available at https://github.com/geekan/MetaGPT.
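The interplay of the two modules can be illustrated with a minimal, hypothetical sketch: a data science problem is modeled as a dependency graph of subproblems, executed in topological order, with each node's code iteratively generated and verified before execution proceeds. The task names, the `generate_code` and `verify` placeholders, and the retry budget below are illustrative assumptions, not the authors' implementation.

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical task graph for a tabular ML pipeline:
# each key maps to the list of subproblems it depends on.
graph = {
    "load_data": [],
    "clean_data": ["load_data"],
    "feature_engineering": ["clean_data"],
    "train_model": ["feature_engineering"],
    "evaluate": ["train_model"],
}

def generate_code(task: str, attempt: int) -> str:
    # Placeholder for LLM-based code generation for one subproblem.
    return f"# attempt {attempt}: code for {task}"

def verify(task: str, code: str, attempt: int) -> bool:
    # Placeholder verifier; a real one would execute the code and
    # check its outputs. Here we pretend one task needs a retry.
    return attempt > 0 or task != "train_model"

# Resolve the dependency graph into a valid execution order.
order = list(TopologicalSorter(graph).static_order())

executed = []
for task in order:
    for attempt in range(3):  # iterative refine-and-verify loop
        code = generate_code(task, attempt)
        if verify(task, code, attempt):
            executed.append(task)
            break
```

In the actual system, the graph itself can also be edited at runtime (dynamic node generation), e.g. inserting a new cleaning node when intermediate data turns out to need it; this sketch only fixes the graph up front for brevity.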