In this work, we investigate the potential of large language model (LLM) based agents to automate data science tasks, with the goal of comprehending task requirements and then building and training the best-fit machine learning models. Despite their widespread success, existing LLM agents are hindered by their tendency to generate unreasonable experiment plans in this scenario. To this end, we present DS-Agent, a novel automatic framework that combines an LLM agent with case-based reasoning (CBR). In the development stage, DS-Agent follows the CBR framework to structure an automatic iteration pipeline, which can flexibly capitalize on expert knowledge from Kaggle and facilitate consistent performance improvement through a feedback mechanism. Moreover, DS-Agent implements a low-resource deployment stage with a simplified CBR paradigm that adapts past successful solutions from the development stage for direct code generation, significantly reducing the demand on the foundational capabilities of LLMs. Empirically, DS-Agent with GPT-4 achieves an unprecedented 100% success rate in the development stage, while attaining a 36% improvement in average one-pass rate across alternative LLMs in the deployment stage. In both stages, DS-Agent achieves the best rank in performance, costing \$1.60 and \$0.13 per run with GPT-4, respectively. Our code is open-sourced at https://github.com/guosyjlu/DS-Agent.
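To make the case-based reasoning loop described above more concrete, the following is a minimal sketch of a generic retrieve, reuse, revise, and retain cycle. It is not the DS-Agent implementation: the names `Case`, `jaccard`, `retrieve`, and `cbr_loop` are hypothetical, the token-overlap similarity is a stand-in for whatever retriever is actually used, and the `revise` and `evaluate` callables are placeholders for an LLM call and a training run, respectively.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Case:
    """A stored example: a task description paired with a solution sketch."""
    task: str
    solution: str


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (an assumption; stands in for a real retriever)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(task: str, case_bank: List[Case]) -> Case:
    """Retrieve step: pick the stored case most similar to the new task."""
    return max(case_bank, key=lambda c: jaccard(task, c.task))


def cbr_loop(
    task: str,
    case_bank: List[Case],
    revise: Callable[[str, str, float], str],  # hypothetical LLM-backed reviser
    evaluate: Callable[[str], float],          # hypothetical run-and-score step
    steps: int = 3,
) -> Tuple[str, float]:
    """Retrieve a case, reuse its solution, revise it with feedback, retain the best."""
    # Reuse: start from the solution of the most similar past case.
    case = retrieve(task, case_bank)
    best_solution, best_score = case.solution, evaluate(case.solution)

    # Revise: feed the current score back and keep only improvements.
    for _ in range(steps):
        candidate = revise(task, best_solution, best_score)
        score = evaluate(candidate)
        if score > best_score:
            best_solution, best_score = candidate, score

    # Retain: store the best solution so later tasks can retrieve it.
    case_bank.append(Case(task, best_solution))
    return best_solution, best_score
```

Under these assumptions, a deployment-style use would call `retrieve` alone and adapt the returned solution in a single pass, skipping the iterative revise step.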