The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable, high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scalable and structured manner. Each entry in our dataset is verified through three hierarchical stages: format checking, actual function execution, and semantic verification, ensuring its reliability and correctness. We demonstrate that models trained with our curated datasets, even with only 7B parameters, can achieve state-of-the-art performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models. Moreover, our 1B model achieves exceptional performance, surpassing GPT-3.5-Turbo and Claude-3 Haiku. We release a dataset containing 60,000 high-quality entries, aiming to advance the field of function-calling agents. The dataset is available on Huggingface: https://huggingface.co/datasets/Salesforce/xlam-function-calling-60k and the project homepage: https://apigen-pipeline.github.io/
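The three hierarchical verification stages can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the paper's implementation: the registry, function names, and the trivial semantic check are all stand-ins (the actual pipeline draws on 3,673 collected APIs and uses a stronger semantic-verification step).

```python
import json

# Hypothetical stand-in for one of the pipeline's executable APIs.
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}

API_REGISTRY = {"get_weather": get_weather}

def check_format(raw: str):
    """Stage 1: the generated call must parse as JSON with the expected keys."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(call, dict) or {"name", "arguments"} - call.keys():
        return None
    return call

def check_execution(call: dict):
    """Stage 2: the named function must exist and execute without error."""
    fn = API_REGISTRY.get(call["name"])
    if fn is None:
        return None
    try:
        return fn(**call["arguments"])
    except Exception:
        return None

def check_semantics(query: str, result) -> bool:
    """Stage 3: semantic verification -- does the execution result actually
    answer the query? A trivial placeholder stands in for the real check."""
    return result is not None

def verify(query: str, raw_call: str) -> bool:
    """An entry survives only if it passes all three stages in order."""
    call = check_format(raw_call)
    if call is None:
        return False
    result = check_execution(call)
    if result is None:
        return False
    return check_semantics(query, result)

good = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
bad = '{"name": "get_weather", "arguments": {"town": "Paris"}}'
print(verify("What is the weather in Paris?", good))  # True
print(verify("What is the weather in Paris?", bad))   # False (wrong argument name)
```

Filtering in this staged order is cheap-to-expensive: malformed entries are dropped by a parser before any function runs, and only executable entries reach the costlier semantic check.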