Public research results on large-scale supervised finetuning of AI agents remain relatively rare, since the collection of agent training data presents unique challenges. In this work, we argue that the bottleneck is not a lack of underlying data sources, but that a large variety of data is fragmented across heterogeneous formats, tools, and interfaces. To this end, we introduce the agent data protocol (ADP), a lightweight representation language that serves as an "interlingua" between agent datasets in diverse formats and unified agent training pipelines downstream. The design of ADP is expressive enough to capture a large variety of tasks, including API/tool use, browsing, coding, software engineering, and general agentic workflows, while remaining simple to parse and train on without per-dataset engineering. In experiments, we unified a broad collection of 13 existing agent training datasets into ADP format, and converted the standardized ADP data into training-ready formats for multiple agent frameworks. We performed SFT on these data, demonstrating an average performance gain of ~20% over the corresponding base models and delivering state-of-the-art or near-state-of-the-art performance on standard coding, browsing, tool use, and research benchmarks, without domain-specific tuning. All code and data are released publicly, in the hope that ADP can help lower the barrier to standardized, scalable, and reproducible agent training.
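To make the "interlingua" idea concrete, the following is a minimal, hypothetical sketch of what a unified trajectory record in the spirit of ADP might look like. The field names (`task_id`, `source_dataset`, `steps`, `role`, `kind`) and the step taxonomy are illustrative assumptions, not the actual ADP schema; the point is only that heterogeneous agent datasets can be normalized into one simple, parseable structure.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List

# NOTE: hypothetical schema for illustration only; not the actual ADP spec.
@dataclass
class Step:
    role: str      # e.g. "user", "assistant", or "tool"
    kind: str      # e.g. "message", "tool_call", "observation"
    content: str   # natural language, code, or serialized tool output

@dataclass
class Trajectory:
    task_id: str
    source_dataset: str          # which upstream dataset this came from
    steps: List[Step] = field(default_factory=list)

    def to_json(self) -> str:
        # A single serialization shared by all source datasets is what
        # lets downstream SFT pipelines avoid per-dataset engineering.
        return json.dumps(asdict(self), indent=2)

# Example: a tool-use episode normalized into the unified record format.
traj = Trajectory(
    task_id="example-001",
    source_dataset="some-tool-use-dataset",
    steps=[
        Step("user", "message", "List the files in /tmp"),
        Step("assistant", "tool_call", 'execute_bash(command="ls /tmp")'),
        Step("tool", "observation", "a.txt  b.txt"),
        Step("assistant", "message", "There are two files: a.txt and b.txt."),
    ],
)
print(traj.to_json())
```

A converter per source dataset would emit records like this, and a single trainer-side adapter would then render them into each framework's chat or tool-calling template.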