DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning

In this work, we investigate the potential of large language model (LLM) based agents to automate data science tasks, with the goal of comprehending task requirements and then building and training the best-fit machine learning models. Despite their widespread success, existing LLM agents are hindered by generating unreasonable experiment plans in this scenario. To this end, we present DS-Agent, a novel automatic framework that harnesses LLM agents and case-based reasoning (CBR). In the development stage, DS-Agent follows the CBR framework to structure an automatic iteration pipeline, which can flexibly capitalize on expert knowledge from Kaggle and facilitate consistent performance improvement through a feedback mechanism. Moreover, DS-Agent implements a low-resource deployment stage with a simplified CBR paradigm that adapts past successful solutions from the development stage for direct code generation, significantly reducing the demand on the foundational capabilities of LLMs. Empirically, DS-Agent with GPT-4 achieves an unprecedented 100% success rate in the development stage, while attaining a 36% improvement in average one-pass rate across alternative LLMs in the deployment stage. In both stages, DS-Agent achieves the best rank in performance, costing $1.60 and $0.13 per run with GPT-4, respectively. Our code is open-sourced at https://github.com/guosyjlu/DS-Agent.
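
To make the iteration pipeline described above more concrete, the following is a minimal sketch of a generic case-based reasoning loop (retrieve, reuse, revise, retain) driven by evaluation feedback. It is not the DS-Agent implementation: the names `Case`, `similarity`, `retrieve`, `cbr_loop`, `toy_revise`, and `toy_evaluate` are all hypothetical. The word-overlap retriever is a stand-in for whatever retrieval DS-Agent actually uses, and the two toy functions stand in for LLM plan revision and model training.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Case:
    """A stored solution: task description, plan text, and achieved score."""
    task: str
    plan: str
    score: float


def similarity(a: str, b: str) -> float:
    """Toy Jaccard word overlap (assumption: a stand-in for a real retriever)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def retrieve(case_bank: List[Case], task: str) -> Case:
    """Retrieve: return the stored case whose task is most similar to `task`."""
    return max(case_bank, key=lambda c: similarity(c.task, task))


def cbr_loop(
    task: str,
    case_bank: List[Case],
    revise: Callable[[str, str], str],
    evaluate: Callable[[str], float],
    steps: int = 3,
) -> Case:
    """Run a few retrieve -> reuse/revise -> evaluate rounds, then retain the best."""
    best: Optional[Case] = None
    feedback = ""  # empty on the first round: the retrieved plan is reused as-is
    for _ in range(steps):
        case = retrieve(case_bank, task)      # retrieve a similar past case
        plan = revise(case.plan, feedback)    # reuse it, revised with feedback
        score = evaluate(plan)                # hypothetical: train and score a model
        feedback = f"last score {score:.2f}"  # feedback for the next round
        if best is None or score > best.score:
            best = Case(task, plan, score)
    case_bank.append(best)                    # retain the best solution found
    return best


# Toy stand-ins (assumptions) for the LLM revision step and model evaluation.
def toy_revise(plan: str, feedback: str) -> str:
    return plan if not feedback else f"{plan} (revised after: {feedback})"


def toy_evaluate(plan: str) -> float:
    return 0.5 + 0.1 * ("revised" in plan)


bank = [
    Case("tabular classification of churn", "gradient boosting", 0.80),
    Case("image classification of cats", "fine-tune a CNN", 0.90),
]
best = cbr_loop("tabular classification of fraud", bank, toy_revise, toy_evaluate)
print(best.plan, best.score, len(bank))
```

In this sketch the retained case is appended to the same case bank, so a later low-resource run could call `retrieve` once and adapt the stored plan directly instead of iterating.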

