Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing LLMs for agent-based tasks is challenging because the available data come from heterogeneous sources, each recording multi-turn trajectories in its own format. In this paper, we introduce \textbf{AgentOhana} as a comprehensive solution to these challenges. \textit{AgentOhana} aggregates agent trajectories from distinct environments, spanning a wide array of scenarios. It standardizes and unifies these trajectories into a consistent format, which streamlines the creation of a generic data loader optimized for agent training. Building on this unified data, our training pipeline maintains balance across the different data sources and preserves independent randomness across devices during dataset partitioning and model training. Additionally, we present \textbf{xLAM-v0.1}, a large action model tailored for AI agents, which demonstrates strong performance across various benchmarks. The project is available at \url{https://github.com/SalesforceAIResearch/xLAM}.
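The two pipeline properties named in the abstract, balance across data sources and separate randomness per device, can be illustrated with a minimal Python sketch. This is not the AgentOhana implementation. The turn schema (`input`/`output`), the names `unify`, `balanced_stream`, and `take`, and the choice to seed each device with `base_seed + device_rank` are all assumptions made for illustration.

```python
import itertools
import random
from typing import Dict, Iterator, List, Tuple

# Hypothetical unified format: a trajectory is a list of turns.
Trajectory = List[Dict[str, str]]


def unify(raw_steps: List[Tuple[str, str]]) -> Trajectory:
    """Map (observation, action) pairs onto one hypothetical turn schema."""
    return [{"input": obs, "output": act} for obs, act in raw_steps]


def balanced_stream(
    sources: Dict[str, List[Trajectory]],
    weights: Dict[str, float],
    base_seed: int,
    device_rank: int,
) -> Iterator[Tuple[str, Trajectory]]:
    """Yield (source_name, trajectory) pairs forever.

    A source is drawn by weight first, so a small dataset is not drowned
    out by a large one; a trajectory is then drawn uniformly within it.
    """
    # Assumption: offsetting the seed by the rank gives each device its own stream.
    rng = random.Random(base_seed + device_rank)
    names = sorted(sources)  # fixed order keeps runs reproducible
    probs = [weights[name] for name in names]
    while True:
        name = rng.choices(names, weights=probs, k=1)[0]
        yield name, rng.choice(sources[name])


def take(stream: Iterator, n: int) -> list:
    """Collect the first n items of a stream."""
    return list(itertools.islice(stream, n))


# Toy data: one small and one large source with equal sampling weight.
sources = {
    "small_env": [unify([("obs", f"act{i}")]) for i in range(3)],
    "large_env": [unify([("obs", f"act{i}")]) for i in range(300)],
}
weights = {"small_env": 1.0, "large_env": 1.0}

rank0 = take(balanced_stream(sources, weights, base_seed=0, device_rank=0), 2000)
rank1 = take(balanced_stream(sources, weights, base_seed=0, device_rank=1), 2000)
```

With equal weights, roughly half of the samples on each device come from `small_env` even though it holds 100 times fewer trajectories, and the two ranks draw different sequences while each remains reproducible from its seed.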